Everninomicin biosynthetic genes

ABSTRACT

This invention is directed to nucleic acids which encode the proteins that direct the synthesis of the orthosomycin everninomicin and to use of the nucleic acids and proteins to produce compounds exhibiting antibiotic activity based on the everninomicin structure. The DNA sequence for the gene clusters responsible for encoding everninomicin biosynthetic genes, which provide the machinery for producing everninomicin, is provided. Thus, this invention provides the nucleic acid sequences needed to synthesize novel everninomicin-related compounds, arising from modifications of the DNA sequence designed to change glycosyl and modified orsellinic acid groups contained in everninomicin. A Micromonospora site-specific integrase gene is also provided, which can be incorporated in a vector for integration into any actinomycete, and particularly into Micromonospora. Thus, the invention further provides methods for introducing heterologous genes into an actinomycete chromosome using this particular vector.

This application is a continuation of U.S. patent application Ser. No. 12/875,342, filed Sep. 3, 2010; which is a continuation of U.S. patent application Ser. No. 11/739,945, filed Apr. 25, 2007; which is a continuation of U.S. patent application Ser. No. 11/021,825, filed Dec. 23, 2004, now U.S. Pat. No. 7,229,813; which is a divisional application of U.S. patent application Ser. No. 09/758,759, filed Jan. 11, 2001, now U.S. Pat. No. 6,861,513; which claims the benefit of U.S. Provisional Patent Application No. 60/175,751, filed Jan. 12, 2000; each of which is herein incorporated by reference in its entirety.

FIELD OF THE INVENTION

This invention is directed to nucleic acid molecules which encode proteins that direct the synthesis of the orthosomycin everninomicin. The present invention also is directed to use of DNA to produce compounds exhibiting antibiotic activity based on the everninomicin structure.

BACKGROUND OF THE INVENTION

Everninomicin Biosynthesis

Everninomicin is an oligosaccharide antibiotic belonging to the orthosomycin group of antibiotics produced by Micromonospora carbonacea var. africana (ATCC 39149, SCC 1413) and is useful as a human medicine. Everninomicin chemically consists of several glycosyl residues attached to modified orsellinic acid. Everninomicin's antibiotic activity is believed to be due to its inhibition of protein synthesis by a mechanism that involves binding of the antibiotic to a ribosome (McNicholas et al., Abstract C-846, ICAAC, San Francisco, Calif., 1999). Everninomicin is structurally similar to the antibiotic avilamycin produced by Streptomyces viridochromogenes Tu57.

The biosynthesis and enzymatic steps necessary for synthesis of homologs of the chemical moieties contained in the everninomicin structure have been studied in other systems. These include synthesis of orsellinic acid (Type I polyketide), glycosyl group synthesis (deoxysugars), and the glycosyltransferases responsible for covalent attachment of glycosyl groups. Orsellinic acid biosynthesis in Penicillium patulum and Streptomyces viridochromogenes Tu57 has been investigated (Beck et al., European Journal of Biochemistry, 1990, 192:487-498; and Gaisser et al., Journal of Bacteriology, 1997, 179:6271-6278). Glycosyl biosynthesis has been reviewed (Hung-wen et al., Annual Review of Microbiology, 1994, 48:223-56; Williams et al., “The Carbohydrates: Chemistry and Biology” Vol. 1B, 1980, 761-798; and Johnson et al., Current Opinion Chem. Biol., 1998, 5:642-9), and has been studied in the erythromycin biosynthetic cluster (Summers et al., Microbiology, 1997, 143:3251-3262). Glycosyltransferases have been studied in a number of systems (Olano et al., Molecular Gen. Genetics, 1998, 3:299-308; Fernandez et al., Journal of Bacteriology, 1998, 18:4929-4937; and Wilson et al., Gene, 1998, 214:95-100).

Polyketides are synthesized via a common mechanistic scheme thought to be related to fatty acid synthesis. The cyclic lactone framework is prepared by a series of condensations involving small carboxylic acid residues (acyl groups). Modifications of the structure, such as ketoreduction, dehydration and enoylreduction, also occur during the processing. The synthesis is driven by a set of large multi-functional polypeptides, referred to as polyketide synthases.

PCT Publication No. WO 93/13663 describes the organization of the gene encoding the polyketide synthase of Saccharopolyspora erythraea. The gene is organized in modules, with each module effecting one condensation step. The precise sequence of chain growth and the processing of the growing chain is determined by the genetic information in each module. This PCT publication describes an approach for synthesizing novel polyketide structures by manipulating in several ways the DNA governing the biosynthesis of the cyclic lactone framework. In order to adapt this methodology to other polyketides, however, the DNA molecules directing the biosynthetic processing must first be isolated.

Combinatorial biosynthesis with bacterial deoxy-sugar biosynthetic genes has been demonstrated (Madduri et al., 1998, Nature Biotechnology, 16:69-74) with the antitumor drug epirubicin (4′-epidoxorubicin) produced by Streptomyces peucetius. The heterologous sugar biosynthetic genes avrE from Streptomyces avermitilis and eryBIV from Saccharopolyspora were introduced into an S. peucetius dnmV mutant blocked in the biosynthesis of daunosamine, the deoxysugar component of epirubicin. Product yields were enhanced with avrE complementation, demonstrating heterologous expression of sugar biosynthetic genes in combinatorial biosynthesis. Glucosylation of the glycopeptide antibiotic vancomycin (Solenberg et al., Chem Biol, 1997, 4:195-202) demonstrated that the heterologous glycosyltransferases gtfB and gtfE from Amycolatopsis orientalis expressed in E. coli produced glycosyltransferase capable of adding glucose or xylose to the vancomycin heptapeptide. Additionally, expression of gtfE from Amycolatopsis orientalis in Streptomyces toyocaensis resulted in glucosylation of A47934, producing a novel antibiotic. Thus, cloned glycosyltransferases can be used to produce novel hybrid antibiotics by glycosylation. In order to adapt this methodology to other glycosyl synthetic genes or glycosyltransferases, however, the DNA molecules directing the biosynthetic processing must first be isolated.

Orsellinic acid is synthesized by AviM, a Type I polyketide synthetase in Streptomyces viridochromogenes Tu57. An acetyl-CoA is used as the “starter” unit and three malonyl-CoAs are used as “extender” units for the synthesis of orsellinic acid. AviM has been shown to synthesize orsellinic acid by introduction of aviM into S. lividans TK24 (Gaisser et al., Journal of Bacteriology, 1997, 179:6271-6278). AviM has homology to the Penicillium patulum Type I polyketide synthase for 6-methylsalicylic acid (MSAS). The M. carbonacea EvrJ protein has homology to both AviM and MSAS and contains polyketide synthetic active site motifs resembling acyl carrier proteins, β-ketoacyl:ACP synthetases, and acetyl-CoA/Malonyl-CoA:ACP acetyltransferases. Thus EvrJ contains motifs necessary for the condensation of malonyl extender units with the starter acetyl-CoA unit.
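The starter/extender bookkeeping described above can be illustrated with a short calculation. The following is a minimal sketch, not part of the disclosed method; the function name and constants are hypothetical. It assumes the standard polyketide chemistry in which the acetyl starter contributes two carbons and each malonyl extender contributes two carbons after decarboxylative condensation, which accounts for the eight carbons of orsellinic acid (C8H8O4).

```python
def polyketide_backbone_carbons(starter_carbons: int,
                                extender_units: int,
                                carbons_per_extender: int = 2) -> int:
    """Count backbone carbons contributed by a starter unit plus extender units.

    Assumption: each malonyl extender loses one carbon as CO2 on condensation,
    so it adds two carbons to the growing chain.
    """
    return starter_carbons + extender_units * carbons_per_extender


# Illustrative constant: an acetyl starter carries two carbons.
ACETYL_STARTER_CARBONS = 2

# One acetyl-CoA starter plus three malonyl-CoA extenders, as stated in the text.
orsellinic_acid_carbons = polyketide_backbone_carbons(ACETYL_STARTER_CARBONS, 3)
print(orsellinic_acid_carbons)  # 8
```
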

The M. carbonacea EvrI protein has homology to DpsC from S. peucetius ATCC 29050. Purified DpsC has been shown to use propionyl-CoA as substrate and to be acylated by propionyl-CoA at the Ser-118 residue (Bao et al., J. Bacteriol, 1999, 181:4690-5). This has led to the proposal that DpsC is responsible for the choice of propionyl-CoA as the starter acyl unit in the biosynthesis of daunorubicin by acting as a β-ketoacyl:acyl carrier protein (ACP) synthetase three (KSIII), catalyzing the first condensation of the propionate-starter unit with malonyl-ACP. Thus EvrI may be responsible for specifying the choice of acetyl-CoA as the starter acyl group in orsellinic acid biosynthesis and condensation with the first malonyl extender unit. EvrI contains a possible Cys-127 acylation site to form the EvrI-Cys-S-acetyl moiety. This active Cys is similar to the active Cys found in the Streptomyces glaucescens FabH (KSIII) enzyme.

The success in cloning and manipulating biosynthetic pathways for the products mentioned above demonstrates a need in the art to isolate and harness the biosynthetic pathway for everninomicin. Moreover, there is a need to employ everninomicin biosynthesis in the development of novel molecules by combinatorial biosynthesis.

Genetic Manipulation of Actinomycetes

The ability to insert genes into the actinomycete chromosome is important to avoid plasmid inhibition of secondary metabolite production and to allow the construction of recombinants that do not require antibiotic selection to maintain cloned genes. Vectors have been developed for use in actinomycetes that contain att/int functions for site-specific integration of plasmid DNA. The two systems available make use of the att/int functions of bacteriophage phiC31 (U.S. Pat. No. 5,190,870) and plasmid pSAM2 (U.S. Pat. No. 5,741,675). However, there is a need for additional vectors with att/int functions for site-specific integration in M. carbonacea.

The present invention addresses these and other needs in the art.

SUMMARY OF THE INVENTION

The present invention advantageously provides the DNA sequence for the gene cluster responsible for encoding everninomicin biosynthetic genes, which provide the machinery for producing everninomicin. As a result, the present invention provides the information needed to synthesize novel everninomicin-related compounds based on everninomicin, arising from modifications of this DNA sequence designed to change glycosyl and modified orsellinic acid groups contained in everninomicin.

Thus, in one embodiment, the invention provides a nucleic acid comprising an everninomicin biosynthetic pathway gene product from a Micromonospora carbonacea, e.g., encoding a protein as set forth in Tables 1a and 1b, and in a specific aspect having a coding region (CDR) as set forth in Tables 1a and 1b.

The invention further provides expression vectors, host cells, and related methods of expression of protein gene products, comprising the isolated nucleic acids of the invention.

In addition, isolated polypeptides corresponding to an everninomicin biosynthetic pathway gene product are provided. Specific open reading frames and amino acid sequences of the polypeptides are set forth in FIG. 11 (SEQ ID NOS: 2-175) and FIG. 12 (SEQ ID NOS: 183-204).

Furthermore, the invention provides modified M. carbonacea, in which an everninomicin biosynthetic pathway gene is knocked-out, or, alternatively, over-expressed (or both). Similarly, the invention provides for metabolic engineering of new everninomicin analogs.

A particular advantage of this invention is the discovery of various everninomicin resistance genes, which can be used as selection markers. Thus, the invention provides a vector comprising an M. carbonacea everninomicin biosynthetic pathway resistance gene, and related methods of selection of transfected or transformed host cells.

In a related but distinct aspect, the inventors have discovered a Micromonospora site-specific integrase. The gene for the integrase can be incorporated in a vector for integration into any actinomycete, and particularly Micromonospora. Thus, the invention further provides a method for introducing a heterologous gene into an actinomycete chromosome using this particular vector.

These and other aspects of the invention are better understood by reference to the following Detailed Description and Examples.

DESCRIPTION OF THE DRAWINGS

FIG. 1. The structure of everninomicin.

FIGS. 2A-C. (A) Map of cosmid clones and subclones that span the whole region of the everninomicin biosynthetic locus and surrounding genomic DNA. Heavy cross-hatching indicates sequenced regions; light cross-hatching indicates regions for which a cosmid restriction map was obtained. (B) Restriction map of cosmid pSPRX272. (C) Restriction map of cosmid pSPRX256. In (B) and (C), cross-hatched regions have been sequenced and cloned fragments are indicated by clone designations beneath the fragment.

FIGS. 3A-D. Map of the everninomicin biosynthetic region of Micromonospora carbonacea var. africana DNA. Distances in bp are shown relative to the beginning of the DNA region. Open reading frames (ORF) are indicated by block arrows. The restriction sites for BamHI, BglII, EcoRI, KpnI, PstI and XhoI restriction enzymes are indicated.

FIGS. 4A-B. Proposed biosynthetic pathway for orsellinic acid synthesis by evrJ and malonyl-CoA synthesis by evbD. (A) Orsellinic acid biosynthesis. (B) Malonyl-CoA biosynthesis.

FIGS. 5A-B. Biosynthetic pathway for D-6-deoxysugar and L-6-deoxysugar biosynthesis by evrV, evrW, and evrX.

FIG. 6. Map of pSPRH830B E. coli-Micromonospora shuttle vector.

FIGS. 7A-B. (A) Map of pSPRH840 integrating vector. (B)(1)-(4) Sequence of integrase gene (SEQ ID NO: 176) and deduced amino acid sequence (SEQ ID NO: 177).

FIG. 8. Map of pSPRH826b insertion plasmid.

FIGS. 9A-B. Analysis of M. carbonacea and M. halophytica pSPRH840 insertion site attB/attP region. (A)(1)-(2) Alignment of pMLP1 attP region with religation clone edge sequences. (B) pMLP1 attP.

FIG. 10. Schematic of specific resistance gene-containing fragments for cloning in the pSPRH830 vector.

FIG. 11A(1)-(95). Everninomicin biosynthetic pathway locus sequence (SEQ ID NO: 1) with open reading frames and deduced amino acid sequences (SEQ ID NOS: 2-175).

FIGS. 12A-K. Everninomicin biosynthetic pathway locus sequence (SEQ ID NO: 182) with open reading frames and deduced amino acid sequences (SEQ ID NOS: 183-204).

DETAILED DESCRIPTION

Micromonospora carbonacea var. africana produces several antibiotics, including everninomicin, thiostrepton, chloramphenicol and lasilosid. As noted above, the present invention advantageously provides the DNA sequence for the gene cluster responsible for encoding everninomicin biosynthetic genes, which provide the machinery for producing everninomicin. As a result, the present invention provides the information needed to synthesize novel everninomicin-related compounds based on everninomicin, arising from modifications of this DNA sequence designed to change glycosyl and modified orsellinic acid groups contained in everninomicin.

The invention also advantageously provides an M. carbonacea-specific integrase gene and integration sites (see FIGS. 7B, 9A, and 9B). Use of the pMLP1 att/int site-specific integration function allows for increasing a given gene dosage and for adding heterologous genes that lead to the formation of new products, such as hybrid antibiotics. This procedure has many advantages over methods involving autonomously replicating plasmids. In particular, a plasmid containing pMLP1 att/int functions would integrate as a single copy per chromosome. Plasmids comprising the site-specific integrating function would introduce the gene of choice into the chromosome of actinomycetes. Vectors lacking actinomycete origins of replication can only exist in their integrated form in actinomycetes. Integrated vectors are extremely stable, which allows the gene copies to be maintained without antibiotic selective pressure. The site-specific nature of the integration allows analysis of the integrants.

“Everninomicin” refers to a lipophilic oligosaccharide antibiotic of the orthosomycin family of antibiotics, which contain at least one acidic phenolic hydrogen, and two orthoester linkages associated with the glycosyl residues (FIG. 1; see PCT Publication No. WO 93/07904). These include, for example, everninomicin, curamycin, avilamycin and flambamycins (Ganguly et al., J.C.S. Chemical Communication, 1976, pp. 609-611; “Kirk-Othmer, Encyclopedia of Chemical Technology”, Vol 2, 1978, Third Edition, John Wiley and Sons, pp. 205-209; Ollis, et al., Tetrahedron, 1979, 35:105-127). These lipophilic oligosaccharide antibiotics exhibit broad spectrum biological activity against gram positive and some gram negative bacteria in various in vitro assays, and in vivo activity, for example, in animal models such as murine models of gram positive infection.

An “everninomicin (EV) biosynthetic pathway gene product” from a Micromonospora carbonacea refers to any enzyme (“EV biosynthetic enzyme”) involved in the biosynthesis of everninomicin. These genes are located in the EV biosynthetic locus on the M. carbonacea chromosome. This locus is depicted in FIGS. 2A and 3. Since everninomicin is only known to be produced in M. carbonacea, for the sake of particularity the EV biosynthetic pathway is associated with this microorganism. However, it should be understood that this term encompasses EV biosynthetic enzymes (and genes encoding such enzymes) isolated from any M. carbonacea, and furthermore that these genes may have novel homologues in related actinomycete bacteria that fall within the scope of the claims here. In specific embodiments, these genes are depicted in FIG. 11 (SEQ ID NO: 1; open reading frames and polypeptides designated as SEQ ID NOS: 2-175) and FIG. 12 (SEQ ID NO: 182; open reading frames and polypeptides designated as SEQ ID NOS: 183-204). It is noted that the sequences of FIGS. 11 and 12 are linked (contiguous) or connected such that they are part of the same cluster, i.e., the sequence in FIG. 12 precedes that of FIG. 11. Moreover, the present inventors have identified specific categories into which many of the genes from the EV biosynthetic pathway fall, including but by no means limited to, orsellinic acid biosynthetic enzymes, sugar biosynthetic enzymes, glycosyltransferases, tailoring enzymes, regulatory enzymes (serine-threonine kinases), and resistance mechanism enzymes (rRNA methylases and transporter enzymes). These categories are discussed in greater detail, infra. The gene products are listed in Tables 1a and 1b.

TABLE 1a Gene Products and Putative Enzymatic Functions Involved in Everninomicin Production

Gene Product (length) | CDS¹ | RBS² | SEQ ID NO.⁴ | Enzymatic Function (Protein ACC No³; BLAST Score) | Class
evdA (416aa) | (132 . . . 1382)* | (1389 . . . 1394)* | 2, 3 | similarity to hydroxylase (CAA11782; 6.5e−137) | sugar biosynthetic
evdB (373aa) | (1490 . . . 2611)* | (2618 . . . 2622)* | 4, 5 | hexose aminotransferase, dnrJ homolog (daunorubicin) (P25048; 2.8e−65) | sugar NH2 addition
evdC (412aa) | (2622 . . . 3860)* | (3867 . . . 3870)* | 6, 7 | similar to flavoprotein, oxidase (S39965; 4.4e−92) | sugar biosynthetic
evdD (389aa) | (4143 . . . 5312) | (4134 . . . 4138) | 8, 9 | dNTP-hexose glycosyltransferase (AAC01731; 4.6e−49) | Glycosyl transfer
evdE (308aa) | (5309 . . . 6235) | | 10, 11 | hexose dehydratase (CAA18814; 8.0e−58) | sugar biosynthetic
evdF (347aa) | (6232 . . . 7275) | (6226 . . . 6229) | 12, 13 | dNTP-hexose glycosyltransferase (CAB07092; 3.4e−18) | Glycosyl transfer
evdG (351aa) | (7272 . . . 8327) | | 14, 15 | unknown | unknown
evdH (340aa) | (8342 . . . 9364) | (8333 . . . 8336) | 16, 17 | dNTP-hexose glycosyltransferase (CAA19930; 0.8) | Glycosyl transfer
evdI (253aa) | (9463 . . . 10,224)* | (10,232 . . . 10,235)* | 18, 19 | hydrolase (AAB81835; 6.8e−10) | sugar biosynthetic
evdJ (250aa) | (10,424 . . . 11,176) | | 20, 21 | unknown | unknown
evdK (415aa) | (11,208 . . . 12,455) | | 22, 23 | hexose dehydratase or epimerase (CAB08849; 3.3e−26) | sugar biosynthetic
evdL (304aa) | (12,108 . . . 13,022)* | (13,027 . . . 13,030)* | 24, 25 | dNTP-hexose glycosyltransferase (S37028; 0.010) | Glycosyl transfer
evrA (317aa) | (14,410 . . . 15,363)* | (15,369 . . . 15,373)* | 26, 27 | hexose epimerase (CAA12010.1; 1.3e−40) | sugar biosynthetic
evrB (344aa) | (15,380 . . . 16,414)* | | 28, 29 | hexose oxidoreductase (ACC01734; 1.3e−65) | sugar biosynthetic
evrC (484aa) | (16,419 . . . 17,873)* | | 30, 31 | hexose dehydratase (CAA12009; 2.2e−107) | sugar biosynthetic
evrD (354aa) | (17,870 . . . 18,934)* | | 32, 33 | GDP-mannose 4,6-dehydratase (BAA16585; 1.0e−88) | sugar biosynthetic
evrE (510aa) | (19,374 . . . 20,906) | | 34, 35 | multidrug efflux transporter (CAB15277; 1.4e−59) | resistance mechanism
evrF (492aa) | (21,064 . . . 22,542) | (21,056 . . . 22,542) | 36, 37 | similar to non-heme oxygenase/halogenase (CAA11780; 4.3e−58) | orsellinic acid chlorine addition
evrG (474aa) | (22,748 . . . 24,172) | (22,736 . . . 22,740) | 38, 39 | oxidase (Q12737; 5.5e−67) | tailoring
evrH (348aa) | (24,177 . . . 25,223)* | (25,230 . . . 25,233)* | 40, 41 | unknown (AAB89073; 3.2e−6) | unknown
evrI (358aa) | (25,550 . . . 26,626) | | 42, 43 | acyl starter unit fidelity (daunorubicin homology) (AAA65208; 5.7e−56) | PKS acyl carbon choice
evrJ (1264aa) | (26,685 . . . 30,479) | (26,672 . . . 26,676) | 44, 45 | orsellinic acid synthase, 6-methylsalicylic acid synthetase (CAA72713; 0.0e) | polyketide synthetase
evrK (439aa) | (30,557 . . . 31,876)* | (31,885 . . . 31,888)* | 46, 47 | Na/H antiporter (BAA16991; 2.1e−14) | unknown
evrL (313aa) | (31,941 . . . 32,882)* | | 48, 49 | similar to gene essential to heme biosynthesis (BAA12681; 0.0012) | unknown
evrM (412aa) | (33,167 . . . 34,405)* | (34,414 . . . 34,418)* | 50, 51 | similar to p450 hydroxylase (S18530; 3.8e−70) | tailoring
evrN (253aa) | (34,449 . . . 35,210)* | (35,219 . . . 35,221)* | 52, 53 | methyl transferase (CAB10751; 0.00061) | tailoring
evrO (314aa) | (35,294 . . . 36,238)* | | 54, 55 | unknown (BAA20094; 0.56) | unknown
evrP (242aa) | (36,235 . . . 36,963)* | | 56, 57 | unknown (CAB05421; 0.00020) | unknown
evrQ (342aa) | (36,998 . . . 38,026)* | | 58, 59 | similar to oxidoreductase and heat stress protein (P80874; 7.8e−31) | tailoring
evrR (164aa) | (38,072 . . . 38,566)* | | 60, 61 | low similarity to hexaheme nitrite reductase (P30866; 0.0034) | regulatory regulator (methyl transferase)
evrS (423aa) | (38,892 . . . 40,163)* | | 62, 63 | dNTP-hexose glycosyltransferase (AAD15267; 1.9e−36) | Glycosyl transfer
evrT (224aa) | (40,216 . . . 40,890)* | (40,899 . . . 40,902)* | 64, 65 | similar to L-proline hydroxylase (BAA20094; 5.5e−7) | tailoring
evrU (229aa) | (40,887 . . . 41,576)* | | 66, 67 | methyltransferase (CAB02029; 5.6e−6) | tailoring
evrV (342aa) | (41,679 . . . 42,707)* | (42,714 . . . 42,717)* | 68, 69 | dTDP-glucose epimerase (AAB84886; 3.5e−36) | L-dTDP-glucose biosynthetic
evrW (329aa) | (42,810 . . . 43,799)* | (43,807 . . . 43,811)* | 70, 71 | dTDP-glucose dehydratase (GDH) (CAA72715; 5.1e−136) | D-dTDP-glucose biosynthetic
evrX (355aa) | (43,799 . . . 44,866)* | | 72, 73 | dTDP-glucose synthetase (A26984; 1.2e−118) | D-dTDP-glucose biosynthetic
evrY (248aa) | (45,014 . . . 45,760)* | (45,767 . . . 45,770)* | 74, 75 | dehalogenase (P24069; 5.8e−8) | drug resistance
evrZ (250aa) | (45,962 . . . 46,714)* | (45,952 . . . 45,956)* | 76, 77 | similar to muramidase/lysozyme (P25310; 1.2e−77) | drug resistance
evsA (692aa) | (47,156 . . . 49,234)* | | 78, 79 | serine threonine kinase (BAA32455; 2.0e−76) | regulatory
evsB (362aa) | (51,627 . . . 52,715) | (51,620 . . . 51,622) | 80, 81 | similar to proteases | unknown
evsC (222aa) | (52,889 . . . 53,557) | | 82, 83 | similar to MAF involved in septum formation (BAA18425; 1.3e−21) | unknown
evbA (217aa) | (53,554 . . . 54,207) | | 84, 85 | O-methyl transferase (AAC44130; 8.6e−38) | tailoring; possible resistance
evbB (251aa) | (54,362 . . . 55,117)* | (55,125 . . . 55,128)* | 86, 87 | membrane pump, homolog mithramycin resistance (AAC443581; 2.9e−24) | resistance mechanism
evbC (319aa) | (55,135 . . . 56,094)* | (56,100 . . . 56,103)* | 88, 89 | membrane pump, homolog mithramycin resistance (AAC44357; 1.0e−69) | resistance mechanism
evbC2 (198aa) | (56,184 . . . 56,813)* | | 90, 91 | ankyrin like (AAC44356; 0.0041) | resistance
evbD (582aa) | (56,961 . . . 58,709) | (56,947 . . . 56,951) | 92, 93 | acyl-CoA carboxylase (CAB07068; 7.3e−201) | malonyl-CoA biosynthesis
evbE (479aa) | (58,873 . . . 60,312) | | 94, 95 | IMP dehydrogenase (CAA15452; 4.1e−165) | tailoring
evbF (185aa) | (60,472 . . . 61,029)* | (61,038 . . . 61,040)* | 96, 97 | hypothetical protein Rv0653c, mycobacterium (CAB07128; 3.8e−06) | regulator
evbF1 (90aa) | (61,288 . . . 61,560) | | 98, 99 | unknown | unknown
evbF2 (152aa) | (61,610 . . . 62,069) | (61,597 . . . 61,599) | 100, 101 | ORFI Streptomyces peucetius (CAA06602; 0.024) | regulatory/resistance
evbG (557aa) | (62,122 . . . 63,795) | | 102, 103 | ABC transporter (Q11046; 2.7e−170) | drug resistance
evbH (645aa) | (63,891 . . . 65,828) | (63,884 . . . 63,887) | 104, 105 | ABC transporter (Q11047; 5.6e−166) | drug resistance
evbI (467aa) | (66,469 . . . 67,872)* | (67,883 . . . 67,886)* | 106, 107 | lipoamide dehydrogenase (CAA17075; 1.6e−140) | tailoring
evbJ (151aa) | (67,979 . . . 68,434) | | 108, 109 | hypothetical protein Rv3304 [Mycobacterium tuberculosis] (CAA17076; 7.6e−40) | unknown
evbK (321aa) | (68,529 . . . 69,494) | | 110, 111 | protease synthase and sporulation regulator; homology to resistance proteins streptomyces (029729; 7.3-7) | regulatory
evbL (249aa) | (69,610 . . . 70,359)* | | 112, 113 | acetyltransferase/phosphotransferase | tailoring
evbM (306aa) | (70,365 . . . 71,285)* | | 114, 115 | hypothetical protein Rv1584c [Mycobacterium tuberculosis] (CAB09085; 0.32) | unknown
evbN (209aa) | (71,289 . . . 71,918)* | (71,926 . . . 71,929)* | 116, 117 | hypothetical protein SC3A7.08 [S. coelicolor] (CAA20071; 4.0e−40) | unknown
evbO (230aa) | (72,284 . . . 72,979) | | 118, 119 | putative lipoprotein [S. coelicolor] (CAA19252; 2.6e−20) | unknown
evbP (420aa) | (72,933 . . . 74,195)* | | 120, 121 | peptidase (CAA17077; 6.5e−88) | unknown
evbQ (527aa) | (74,707 . . . 76,290)* | | 122, 123 | methylmalonyl-CoA mutase (BAA30410; 1.8e−149) | acyl precursor biosynthesis
evbR (696aa) | (76,622 . . . 78,712) | | 124, 125 | protein serine/threonine kinase, note eukaryotic type (BAA32455; 1.1e−71) | regulatory
evbS (576aa) | (78,791 . . . 80,521) | | 126, 127 | phosphomannomutase (CAA17080; 5.4e−91) | sugar biosynthesis
evbT (286aa) | (82,073 . . . 82,933) | | 128, 129 | hypothetical protein SC5C7.22c (CAA20634; 5.7e−28) | 10-28
evbU (202aa) | (83,280 . . . 83,888)* | | 130, 131 | glucose-6-phosphate 1-dehydrogenase, low BLAST homology (S61167; 0.00039) | unknown
evbV (193aa) | (84,080 . . . 84,661)* | | 132, 133 | uracil phosphoribosyl transferase (CAA17081; 5.6e−60) | unknown
evbW (338aa) | (84,890 . . . 85,906)* | | 134, 135 | deoxyribose-phosphate aldolase (AAA79343; 1.3e−54) | unknown
evbX (477aa) | (85,909 . . . 87,342) | | 136, 137 | aldehyde dehydrogenase (AAB84440; 4.2e−103) | tailoring
evbY (245aa) | (87,422 . . . 88,159) | (87,407 . . . 87,411) | 138, 139 | aldehyde dehydrogenase (CAA71003; 3.4e−16) | tailoring
evbZ (137aa) | (88,292 . . . 88,705) | (88,280 . . . 88,282) | 140, 141 | hypothetical protein (CAB06141; 1.3e−16) | unknown
evcA (301aa) | (88,716 . . . 89,621) | | 142, 143 | hypothetical protein, putative integral membrane protein [Streptomyces coelicolor] (CAB06143; 4.5e−28) | unknown
evcB (416aa) | (89,817 . . . 91,067) | | 144, 145 | cytochrome D oxidase subunit I (P94364; 3.0e−65) | tailoring
evcC (335aa) | (91,078 . . . 92,085) | (91,068 . . . 91,072) | 146, 147 | cytochrome D oxidase subunit II (CAA71118; 1.9e−15) | tailoring
evcD (561aa) | (92,148 . . . 93,833) | | 148, 149 | ABC transporter (CAA22219; 2.6e−107) | resistance
evcE (613aa) | (93,830 . . . 95,671) | | 150, 151 | ABC transporter (AAC44070; 3.4e−32) | resistance
evcF (229aa) | (95,729 . . . 96,418) | | 152, 153 | unknown | unknown
evcG (111aa) | (96,440 . . . 96,775)* | | 154, 155 | unknown (AAB84787; 1.9e−8) | unknown
evcH (303aa) | (96,894 . . . 97,805) | | 156, 157 | unknown (CAA17083; 9.2e−5) | unknown
evcI (691aa) | (98,287 . . . 100,362) | | 158, 159 | unknown search (CAA19992; 6.0e−6) | unknown
evcJ (197aa) | (100,733 . . . 101,326)* | | 160, 161 | putative ATP/GTP binding protein (CAA19989; 7.9e−59) | unknown
evcJ2 (134aa) | (101,328 . . . 101,732)* | | 162, 163 | unknown (CAA19986; 8.6e−23) | unknown
evcK (117aa) | (101,803 . . . 102,156)* | | 164, 165 | unknown (CAA19991; 1.7e−36) | unknown
evcL (1145aa) | (102,204 . . . 105,641)* | | 166, 167 | unknown search (CAA19992; 4.6e−99) | unknown
evcM (201aa) | (105,907 . . . 105,641) | | 168, 169 | putative uridine kinase (CAA19591; 1.0e−9) | unknown
evcN (358aa) | (106,513 . . . 107,589) | | 170, 171 | unknown (CAA17085; 7.5e−120) | unknown
evrMR (320aa) | (107,653 . . . 108,615) | (107,637 . . . 107,641) | 172, 173 | homology to 23S rRNA methylase for mycinamicin resistance (myrA) (BAA03674; 1.4e−79) | resistance
evrMR2 (193aa) | (108,635 . . . 109,216) | | 174, 175 | homology to gene linked to myrA | resistance

TABLE 1b Gene Products and Putative Enzymatic Functions Involved in Everninomicin Production

Gene Product (length) | CDS¹ | RBS² | SEQ ID NO.⁴ | Enzymatic Function (Protein ACC No³; BLAST Score) | Class
ORF1 (291aa) | (189-1064)* | (1069-1073) | 183, 184 | Transcriptional regulator, Biotinylation (H70979; 8e−31) | unknown
ORF2 (527aa) | (1184-2767)* | | 185, 186 | Propionyl-CoA carboxylase (T42208; 0.0e) | unknown
ORF3 (296aa) | (2863-3753)* | | 187, 188 | unknown | unknown
ORF4 (166aa) | (3776-4276)* | (4280-4284) | 189, 190 | ECF sigma factor (T36644; 8e−26) | regulation
ORF5 (280aa) | (4526-5368)* | | 191, 192 | Membrane protein (CAB94598.1; 5e−50) | unknown
ORF6 (251aa) | (5392-6147)* | (6152-6156) | 193, 194 | rRNA methyltransferase (AAG32067.1; 4e−49) | resistance
ORF7 (362aa) | (6194-7282)* | | 195, 196 | O-methyl transferase (PP42712; 4e−59) | modification
ORF8 (284aa) | (7280-8133) | (8141-8145) | 197, 198 | unknown | unknown
ORF9 (354aa) | (8254-9318) | (9324-9328) | 199, 200 | oxidoreductase (AAG05128.1; 3e−51) | modification
ORF10 (309aa) | (9575-10,504) | (9568-9571) | 201, 202 | unknown | unknown
ORF11 (333aa) | (10,584-11,585) | | 203, 204 | deoxyhexose ketoreductase (T17473; 1e−49) | sugar modification

Legend for Tables 1a and 1b
* CDS, RBS complement on full length biosynthetic locus sequence.
¹ CDS is the putative coding sequence.
² RBS is the putative ribosome binding site.
³ GenBank protein database (http://www.ncbi.nih.gov/Entrez/protein.html)
⁴ The first number corresponds to the nucleotide sequence and the second number corresponds to the amino acid sequence.
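The CDS coordinates and protein lengths listed in Tables 1a and 1b can be cross-checked against one another. The following is a minimal sketch for that purpose; the function names are hypothetical and not part of the disclosure. It assumes that each CDS range is inclusive and contains the stop codon, so that the protein length is one less than the number of codons. This assumption agrees with the rows used below but is not stated in the tables.

```python
import re


def parse_cds(cds: str) -> tuple[int, int, bool]:
    """Parse a table entry such as '(132 . . . 1382)*' or '(10,584-11,585)'.

    Returns (start, end, on_complement_strand); the trailing '*' marks
    entries on the complement strand, per the table legend.
    """
    numbers = [int(n.replace(",", "")) for n in re.findall(r"\d[\d,]*", cds)]
    if len(numbers) != 2:
        raise ValueError(f"expected two coordinates in {cds!r}")
    start, end = sorted(numbers)
    return start, end, cds.strip().endswith("*")


def implied_protein_length(cds: str) -> int:
    """Amino-acid count implied by a CDS range (assumes stop codon included)."""
    start, end, _ = parse_cds(cds)
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError(f"{cds!r} spans {nucleotides} nt, not a multiple of 3")
    return nucleotides // 3 - 1


# A few rows copied from Tables 1a and 1b: gene -> (CDS entry, listed length).
rows = {
    "evdA": ("(132 . . . 1382)*", 416),
    "evrJ": ("(26,685 . . . 30,479)", 1264),
    "ORF1": ("(189-1064)*", 291),
    "ORF11": ("(10,584-11,585)", 333),
}

for gene, (cds, listed) in rows.items():
    status = "ok" if implied_protein_length(cds) == listed else "mismatch"
    print(gene, implied_protein_length(cds), listed, status)
```

A row whose range is not a multiple of three, or whose implied length differs from the listed length, would be a candidate for checking against the sequence in FIG. 11 or FIG. 12.
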

Although the term “enzymes” is used to refer to the EV biosynthetic pathway gene products, such gene products may be proteins with non-enzymatic functions. Such proteins are also contemplated as falling within the scope of the present invention.

An “EV biosynthetic pathway bottleneck gene” is a gene encoding a product whose level limits the rate of synthesis of everninomicin. Examples of such gene products include, though are not limited to, evrJ (involved in orsellinic acid biosynthesis); evrV, evrW, and evrX (involved in dTDP-glucose synthesis); evbD (involved in malonyl-CoA synthesis, which is required for orsellinic acid synthesis); and oxidases responsible for oxidation of the amino group on the terminal sugar to produce everninomicin that contains a nitrososugar group. Other likely bottleneck genes include those encoding glycosyltransferases (evdD, evdF, evdH, evdL, and evrS) and tailoring enzymes, particularly sugar modification enzymes.

A modified Micromonospora carbonacea refers to a microorganism that has been genetically engineered to over-express or suppress expression of an EV biosynthetic pathway gene product (enzyme). Such genetic engineering and manipulation is described in detail, infra. Preferably, to increase the level of production of everninomicin, the modified microorganism overexpresses one or more bottleneck genes. To produce an everninomicin analog or homolog, various tailoring enzyme genes (e.g., evdB, a hexose aminotransferase that produces an amino sugar; evrF, a nonheme halogenase that chlorinates the orsellinic acid; or an oxidase gene that produces a nitrososugar by oxidation of an aminosugar) may be knocked out. Other knock-outs may be made of putative key genes, resulting in all likelihood in blockage of everninomicin biosynthesis. These include the orsellinic acid synthase (evrJ), dTDP-glucose synthases (evrV, evrW, and evrX), and glycosyltransferases (evdD, evdF, evdH, evdL, and evrS). A knockout of the glycosyltransferase that adds the terminal glycosyl group is expected to produce an everninomicin analog lacking the terminal glycosyl group.

Such genetic construction can be replicated in a different actinomycete, such as a Streptomyces, as described infra, by introduction of all or part of the modified everninomicin biosynthetic pathway described herein to such a host cell.

A Micromonospora carbonacea “everninomicin biosynthetic pathway resistance gene product” is a protein or enzyme that confers resistance to everninomicin (and related compounds) to a host cell. Expression of such a gene on a vector provides an alternative selection mechanism for transformed host cells in vitro or in vivo, and thus can be used in molecular biological manipulations of cells independently of the EV biosynthetic pathway. For example, such a vector can be used to select for a transfected or transformed host cell by culturing the cell in the presence of an amount of everninomicin that is toxic to the host cell lacking the vector.

Micromonospora site-specific Att/Int functions consist of an integrase protein and AttP site, e.g., as depicted in FIG. 7B (SEQ ID NO: 177) and in a specific embodiment encoded by a nucleic acid having a sequence as depicted in FIG. 7B (SEQ ID NO: 176), that permit site-specific integration of a vector into an actinomycete, and particularly a Micromonospora, genome.

General Definitions

As used herein, the term “isolated” means that the referenced material is removed from the environment in which it is normally found. Thus, an isolated biological material can be free of cellular components, i.e., components of the cells in which the material is found or produced. In the case of nucleic acid molecules, an isolated nucleic acid includes a PCR product, an isolated mRNA, a cDNA, or a restriction fragment. In another embodiment, an isolated nucleic acid is preferably excised from the chromosome in which it may be found, and more preferably is no longer joined to non-regulatory, non-coding regions, or to other genes, located upstream or downstream of the gene contained by the isolated nucleic acid molecule when found in the chromosome. In yet another embodiment, the isolated nucleic acid lacks one or more introns. Isolated nucleic acid molecules include sequences inserted into plasmids, cosmids, artificial chromosomes, and the like. Thus, in a specific embodiment, a recombinant nucleic acid is an isolated nucleic acid. An isolated protein may be associated with other proteins or nucleic acids, or both, with which it associates in the cell, or with cellular membranes if it is a membrane-associated protein. An isolated organelle, cell, or tissue is removed from the anatomical site in which it is found in an organism. An isolated material may be, but need not be, purified.

The term “purified” as used herein refers to material that has been isolated under conditions that reduce or eliminate the presence of unrelated materials, i.e., contaminants, including native materials from which the material is obtained. For example, a purified protein is preferably substantially free of other proteins or nucleic acids with which it is associated in a cell; a purified nucleic acid molecule is preferably substantially free of proteins or other unrelated nucleic acid molecules with which it can be found within a cell. As used herein, the term “substantially free” is used operationally, in the context of analytical testing of the material. Preferably, purified material substantially free of contaminants is at least 50% pure; more preferably, at least 90% pure, and more preferably still at least 99% pure. Purity can be evaluated by chromatography, gel electrophoresis, immunoassay, composition analysis, biological assay, and other methods known in the art.

Methods for purification are well-known in the art. For example, nucleic acids can be purified by precipitation, chromatography (including preparative solid phase chromatography, oligonucleotide hybridization, and triple helix chromatography), ultracentrifugation, and other means. Polypeptides and proteins can be purified by various methods including, without limitation, preparative disc-gel electrophoresis, isoelectric focusing, HPLC, reversed-phase HPLC, gel filtration, ion exchange and partition chromatography, precipitation and salting-out chromatography, extraction, and countercurrent distribution. For some purposes, it is preferable to produce the polypeptide in a recombinant system in which the protein contains an additional sequence tag that facilitates purification, such as, but not limited to, a polyhistidine sequence, or a sequence that specifically binds to an antibody, such as FLAG and GST. The polypeptide can then be purified from a crude lysate of the host cell by chromatography on an appropriate solid-phase matrix. Alternatively, antibodies produced against the protein or against peptides derived therefrom can be used as purification reagents. Cells can be purified by various techniques, including centrifugation, matrix separation (e.g., nylon wool separation), panning and other immunoselection techniques, depletion (e.g., complement depletion of contaminating cells), and cell sorting (e.g., fluorescence activated cell sorting [FACS]). Other purification methods are possible. A purified material may contain less than about 50%, preferably less than about 75%, and most preferably less than about 90%, of the cellular components with which it was originally associated. The term “substantially pure” indicates the highest degree of purity which can be achieved using conventional purification techniques known in the art.

In a specific embodiment, the term “about” or “approximately” means within 20%, preferably within 10%, and more preferably within 5% of a given value or range. Alternatively, especially in biological systems, the term “about” means within about a log (i.e., an order of magnitude), preferably within a factor of two, of a given value, depending on how quantitative the measurement is.
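By way of illustration only, the two readings of “about” given above can be expressed as simple numeric checks. The following is a minimal sketch; the function names `within_about` and `within_factor` are hypothetical and do not appear in the specification.

```python
def within_about(measured: float, nominal: float, tolerance: float = 0.20) -> bool:
    """Return True if `measured` lies within a fractional tolerance of `nominal`.

    The default of 0.20 mirrors the "within 20%" reading; 0.10 and 0.05
    correspond to the preferred and more preferred readings.
    """
    if nominal == 0:
        return measured == 0
    return abs(measured - nominal) <= tolerance * abs(nominal)


def within_factor(measured: float, nominal: float, factor: float = 10.0) -> bool:
    """Return True if `measured` is within a multiplicative factor of `nominal`.

    A factor of 10 corresponds to "within about a log"; a factor of 2
    corresponds to the preferred reading for biological systems.
    """
    if measured <= 0 or nominal <= 0:
        raise ValueError("both values must be positive for a fold comparison")
    ratio = measured / nominal
    return 1.0 / factor <= ratio <= factor
```

For example, a measured value of 118 against a nominal value of 100 satisfies the 20% reading but not the 10% reading.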

The use of italics indicates a nucleic acid molecule (e.g., evrJ cDNA, gene, etc.); normal text indicates the polypeptide or protein.

“Sequence-conservative variants” of a polynucleotide sequence are those in which a change of one or more nucleotides in a given codon position results in no alteration in the amino acid encoded at that position.
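The definition above can be illustrated with a short sketch that translates two coding sequences and compares the encoded amino acids. It assumes the standard genetic code and in-frame input whose length is a multiple of three; the names `translate` and `is_sequence_conservative` are hypothetical and are introduced here only for illustration.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate an in-frame DNA (or RNA) coding sequence; '*' marks a stop codon."""
    seq = seq.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def is_sequence_conservative(parent: str, variant: str) -> bool:
    """True if the variant differs in nucleotides but encodes the same amino acids."""
    same_protein = translate(parent) == translate(variant)
    differs = parent.upper() != variant.upper()
    return same_protein and differs
```

For instance, changing CTG to TTA and AAA to AAG leaves leucine and lysine unchanged, so the variant is sequence-conservative, whereas changing CTG to CCG replaces leucine with proline.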

“Function-conservative variants” are those in which a given amino acid residue in a protein or enzyme has been changed without altering the overall conformation and function of the polypeptide, including, but not limited to, replacement of an amino acid with one having similar properties (such as, for example, polarity, hydrogen bonding potential, acidic, basic, hydrophobic, aromatic, and the like). Amino acids with similar properties are well known in the art. For example, arginine, histidine and lysine are hydrophilic-basic amino acids and may be interchangeable. Similarly, isoleucine, a hydrophobic amino acid, may be replaced with leucine, methionine or valine. Such changes are expected to have little or no effect on the apparent molecular weight or isoelectric point of the protein or polypeptide. Amino acids other than those indicated as conserved may differ in a protein or enzyme so that the percent protein or amino acid sequence similarity between any two proteins of similar function may vary and may be, for example, from 70% to 99% as determined according to an alignment scheme such as by the Cluster Method, wherein similarity is based on the MEGALIGN algorithm. A “function-conservative variant” also includes a polypeptide or enzyme which has at least 60% amino acid identity as determined by BLAST or FASTA algorithms, preferably at least 75%, most preferably at least 85%, and even more preferably at least 90%, and which has the same or substantially similar properties or functions as the native or parent protein or enzyme to which it is compared.
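The identity thresholds recited above can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned to equal length (with "-" marking gaps) and simply counts matching columns; it is not a substitute for the BLAST or FASTA algorithms, which perform the alignment itself. The names `percent_identity` and `identity_tier` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the aligned columns of two pre-aligned sequences.

    Columns that are gaps in both sequences are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the 60/75/85/90% thresholds recited above."""
    if pct >= 90:
        return "at least 90%"
    if pct >= 85:
        return "at least 85%"
    if pct >= 75:
        return "at least 75%"
    if pct >= 60:
        return "at least 60%"
    return "below 60%"
```

Identity alone does not establish a function-conservative variant; as stated above, the variant must also have the same or substantially similar properties or functions as the parent protein.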

The terms “mutant” and “mutation” mean any detectable change in genetic material, e.g. DNA, or any process, mechanism, or result of such a change. This includes gene mutations, in which the structure (e.g. DNA sequence) of a gene is altered, any gene or DNA arising from any mutation process, and any expression product (e.g. protein or enzyme) expressed by a modified gene or DNA sequence. The term “variant” may also be used to indicate a modified or altered gene, DNA sequence, enzyme, cell, etc., i.e., any kind of mutant.

As used herein, the term “homologous” in all its grammatical forms and spelling variations refers to the relationship between proteins that possess a “common evolutionary origin,” including proteins from superfamilies (e.g., the immunoglobulin superfamily) and homologous proteins from different species (e.g., myosin light chain, etc.) (Reeck et al., Cell 50:667, 1987). Such proteins (and their encoding genes) have sequence homology, as reflected by their sequence similarity, whether in terms of percent similarity or the presence of specific residues or motifs at conserved positions.

Accordingly, the term “sequence similarity” in all its grammatical forms refers to the degree of identity or correspondence between nucleic acid or amino acid sequences of proteins that may or may not share a common evolutionary origin (see Reeck et al., supra). However, in common usage and in the instant application, the term “homologous,” when modified with an adverb such as “highly,” may refer to sequence similarity and may or may not relate to a common evolutionary origin.

In a specific embodiment, two DNA sequences are “substantially homologous” or “substantially similar” when the encoded polypeptides are at least 35-40% similar as determined by one of the algorithms disclosed herein, preferably at least about 60%, and most preferably at least about 90 or 95% in a highly conserved domain, or, for alleles, across the entire amino acid sequence. Sequence comparison algorithms include BLAST (BLAST P, BLAST N, BLAST X), FASTA, DNA Strider, the GCG (Genetics Computer Group, Program Manual for the GCG Package, Version 7, Madison, Wis.) pileup program, etc. using the default parameters provided with these algorithms. An example of such a sequence is an allelic or species variant of the specific everninomicin biosynthetic genes of the invention. Sequences that are substantially homologous can be identified by comparing the sequences using standard software available in sequence data banks, or in a Southern hybridization experiment under, for example, stringent conditions as defined for that particular system.

Cloning and Expression of EV Biosynthetic Pathway Genes

The present invention contemplates analysis and isolation, and/or construction, of a gene encoding a functional or mutant EV biosynthetic enzyme, including a full length, or naturally occurring form of an EV biosynthetic enzyme, and any antigenic fragments thereof from any source. It further contemplates expression of functional or mutant EV biosynthetic enzyme protein for evaluation, diagnosis, or, particularly, biosynthesis of everninomicin or other secondary metabolic products.

In accordance with the present invention there may be employed conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Sambrook, Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (herein “Sambrook et al., 1989”); DNA Cloning: A Practical Approach, Volumes I and II (D. N. Glover ed. 1985); Oligonucleotide Synthesis (M. J. Gait ed. 1984); Nucleic Acid Hybridization [B. D. Hames & S. J. Higgins eds. (1985)]; Transcription And Translation [B. D. Hames & S. J. Higgins, eds. (1984)]; Animal Cell Culture [R. I. Freshney, ed. (1986)]; Immobilized Cells And Enzymes [IRL Press, (1986)]; B. Perbal, A Practical Guide To Molecular Cloning (1984); F. M. Ausubel et al. (eds.), Current Protocols in Molecular Biology, John Wiley & Sons, Inc. (1994).

Molecular Biology—Definitions

“Amplification” of DNA, as used herein, denotes the use of polymerase chain reaction (PCR) to increase the concentration of a particular DNA sequence within a mixture of DNA sequences. For a description of PCR see Saiki et al., Science, 239:487, 1988.

A “nucleic acid molecule” refers to the phosphate ester polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; “RNA molecules”); or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; “DNA molecules”); or any phosphoester analogs thereof, such as phosphorothioates and thioesters, in either single stranded form, or a double-stranded helix; or “protein nucleic acids” (PNA) formed by conjugating bases to an amino acid backbone; or nucleic acids containing modified bases, for example thiouracil, thio-guanine and fluoro-uracil. Double stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible. The term nucleic acid molecule, and in particular DNA or RNA molecule, refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear (e.g., restriction fragments) or circular DNA molecules, plasmids, and chromosomes. In discussing the structure of particular double-stranded DNA molecules, sequences may be described herein according to the normal convention of giving only the sequence in the 5′ to 3′ direction along the nontranscribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA). A “recombinant DNA molecule” is a DNA molecule that has undergone a molecular biological manipulation.

A “polynucleotide” or “nucleotide sequence” is a series of nucleotide bases (also called “nucleotides”) in DNA and RNA, and means any chain of two or more nucleotides. A nucleotide sequence typically carries genetic information, including the information used by cellular machinery to make proteins and enzymes. These terms include double or single stranded genomic and cDNA, RNA, any synthetic and genetically manipulated polynucleotide, and both sense and anti-sense polynucleotide (although only sense strands are being represented herein). This includes single- and double-stranded molecules, i.e., DNA-DNA, DNA-RNA and RNA-RNA hybrids.

The polynucleotides herein may be flanked by natural regulatory (expression control) sequences, or may be associated with heterologous sequences, including promoters, internal ribosome entry sites (IRES) and other ribosome binding site sequences, enhancers, response elements, suppressors, signal sequences, polyadenylation sequences, introns, 5′- and 3′-non-coding regions, and the like. The nucleic acids may also be modified by many means known in the art. Furthermore, the polynucleotides herein may also be oligonucleotides modified with a label capable of providing a detectable signal, either directly or indirectly. Exemplary labels include radioisotopes, fluorescent molecules, biotin, and the like.

A “coding sequence” or a sequence “encoding” an expression product, such as a RNA, polypeptide, protein, or enzyme, is a minimum nucleotide sequence that, when expressed, results in the production of that RNA, polypeptide, protein, or enzyme, i.e., the nucleotide sequence encodes an amino acid sequence for that polypeptide, protein or enzyme. A coding sequence for a protein may include a start codon (usually ATG, though as shown herein, alternative start codons can be used) and a stop codon.
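As a simple illustration of the start and stop codon features mentioned above, the following sketch checks whether a DNA string reads as one complete open reading frame. The set of alternative start codons used here (GTG and TTG, which are common in bacteria) is an assumption made for this example and is not taken from the specification; the function name is likewise hypothetical.

```python
# ATG is the usual start codon; GTG and TTG are included as assumed
# examples of alternative bacterial start codons.
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_complete_coding_sequence(seq: str) -> bool:
    """True if `seq` begins with a start codon, ends with a stop codon,
    has a length divisible by three, and has no in-frame internal stop."""
    seq = seq.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    has_start = codons[0] in START_CODONS
    has_stop = codons[-1] in STOP_CODONS
    internal_stop = any(c in STOP_CODONS for c in codons[1:-1])
    return has_start and has_stop and not internal_stop
```

A check of this kind only inspects the reading frame; it says nothing about whether the encoded product is functional.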

The term “gene”, also called a “structural gene”, means a DNA sequence that codes for a particular sequence of amino acids, which comprise all or part of one or more proteins or enzymes, and may include regulatory (non-transcribed) DNA sequences, such as promoter sequences, which determine for example the conditions under which the gene is expressed. The transcribed region of the gene may include untranslated regions, including a 5′-untranslated region (UTR) and 3′-UTR, as well as the coding sequence.

A “promoter sequence” is a DNA regulatory region capable of binding RNA polymerase in a cell and initiating transcription of a downstream (3′ direction) coding sequence. For purposes of defining the present invention, the promoter sequence is bounded at its 3′ terminus by the transcription initiation site and extends upstream (5′ direction) to include the minimum number of bases or elements necessary to initiate transcription at levels detectable above background. Within the promoter sequence will be found a transcription initiation site (conveniently defined for example, by mapping with nuclease S1), as well as protein binding domains (consensus sequences) responsible for the binding of RNA polymerase.

A coding sequence is “under the control of” or “operably (or operatively) associated with” transcriptional and translational control sequences in a cell when RNA polymerase transcribes the coding sequence into mRNA, which is then trans-RNA spliced (if it contains introns) and translated into the protein encoded by the coding sequence.

The terms “express” and “expression” mean allowing or causing the information in a gene or DNA sequence to become manifest, for example producing a protein by activating the cellular functions involved in transcription and translation of a corresponding gene or DNA sequence. A DNA sequence is expressed in or by a cell to form an “expression product” such as mRNA or a protein. The expression product itself, e.g. the resulting mRNA or protein, may also be said to be “expressed” by the cell. An expression product can be characterized as intracellular, extracellular or secreted. The term “intracellular” means something that is inside a cell. The term “extracellular” means something that is outside a cell. A substance is “secreted” by a cell if it appears in significant measure outside the cell, from somewhere on or inside the cell.

The term “transfection” means the introduction of a heterologous nucleic acid into a host cell. The term “transformation” means the introduction of a heterologous gene, DNA or RNA sequence to a host cell, so that the host cell will express the introduced gene or sequence to produce a desired product. The introduced gene or sequence may also be called a “cloned” or “heterologous” gene or sequence, and may include regulatory or control sequences, such as start, stop, promoter, signal, secretion, or other sequences used by a cell's genetic machinery. The gene or sequence may include nonfunctional sequences or sequences with no known function. A host cell that receives and expresses introduced DNA or RNA has been “transformed” and is a “transformant” or a “clone.” The DNA or RNA introduced to a host cell can come from any source, including cells of the same genus or species as the host cell, or cells of a different genus or species.

The terms “vector”, “cloning vector” and “expression vector” mean the vehicle by which a DNA or RNA sequence (e.g. a foreign gene) can be introduced into a host cell, so as to transform the host and promote expression (e.g. transcription and translation) of the introduced sequence. Vectors include plasmids, phages, viruses, etc.; they are discussed in greater detail below.

Vectors typically comprise the DNA of a transmissible agent, into which heterologous DNA is inserted. A common way to insert one segment of DNA into another segment of DNA involves the use of enzymes called restriction enzymes that cleave DNA at specific sites (specific groups of nucleotides) called restriction sites. A “cassette” refers to a DNA coding sequence or segment of DNA that codes for an expression product that can be inserted into a vector at defined restriction sites. The cassette restriction sites are designed to ensure insertion of the cassette in the proper reading frame. Generally, foreign DNA is inserted at one or more restriction sites of the vector DNA, and then is carried by the vector into a host cell along with the transmissible vector DNA. A segment or sequence of DNA having inserted or added DNA, such as an expression vector, can also be called a “DNA construct.” A common type of vector is a “plasmid”, which generally is a self-contained molecule of double-stranded DNA, usually of bacterial origin, that can readily accept additional (foreign) DNA and which can readily be introduced into a suitable host cell. A plasmid vector often contains coding DNA and promoter DNA and has one or more restriction sites suitable for inserting foreign DNA. Promoter DNA is a DNA sequence which initiates, regulates, or otherwise mediates or controls the expression of the coding DNA. Promoter DNA and coding DNA may be from the same gene or from different genes, and may be from the same or different organisms. A large number of vectors, including plasmid and fungal vectors, have been described for replication and/or expression in a variety of eukaryotic and prokaryotic hosts. Non-limiting examples include pKK plasmids (Clontech), pUC plasmids, pET plasmids (Novagen, Inc., Madison, Wis.), pRSET or pREP plasmids (Invitrogen, San Diego, Calif.), or pMAL plasmids (New England Biolabs, Beverly, Mass.), and many appropriate host cells, using methods disclosed or cited herein or otherwise known to those skilled in the relevant art. Recombinant cloning vectors will often include one or more replication systems for cloning or expression, one or more markers for selection in the host, e.g. antibiotic resistance, and one or more expression cassettes.
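The role of restriction sites in choosing an insertion point can be illustrated with a short sketch that locates recognition sequences in a DNA string and reports which enzymes would cut exactly once. The three enzymes and their well-known recognition sequences are included only as examples and are not recited in the specification; the function names and the test sequence are hypothetical.

```python
# Example recognition sequences (5' to 3'); chosen for illustration only.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(seq: str, site: str) -> list[int]:
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def unique_cutters(seq: str, enzymes: dict[str, str] = RECOGNITION_SITES) -> list[str]:
    """Names of enzymes whose recognition sequence occurs exactly once in `seq`."""
    return sorted(name for name, site in enzymes.items()
                  if len(find_sites(seq, site)) == 1)
```

An enzyme that recognizes a single site in a linear vector sequence is a natural candidate for opening the vector at one defined position; the sketch treats the sequence as linear and searches one strand only, which suffices for palindromic sites such as these.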

The term “host cell” means any cell of any organism that is selected, modified, transformed, grown, or used or manipulated in any way, for the production of a substance by the cell, for example the expression by the cell of a gene, a DNA or RNA sequence, a protein or an enzyme. Host cells can further be used for screening or other assays, as described infra. In a preferred aspect, a host cell of the invention is an actinomycete, preferably of the genus Streptomyces (e.g., a host cell as described in Ziermann and Betlach, BioTechniques, 1999, 26:106) or alternatively Micromonospora. Additional examples include, but are not limited to, the strains S. pristinaespiralis (ATCC 25486), S. antibioticus (DSM 40868), S. bikiniensis (ATCC 11062), S. parvulus (ATCC 12434), S. glaucescens (ETH 22794), S. actuosus (ATCC 25421), S. coelicolor (A3(2)), S. ambofaciens, S. lividans, S. griseofuscus, S. limosus, and the like (see also Smokvina et al., Proceedings, 1:403-407).

The term “expression system” means a host cell and compatible vector under suitable conditions, e.g., for the expression of a protein coded for by foreign DNA carried by the vector and introduced to the host cell. Common expression systems include E. coli host cells and plasmid vectors, although the actinomycete host cell expression systems are preferred for biosynthesis of everninomicin and related products.

The term “heterologous” refers to a combination of elements not naturally occurring. For example, heterologous DNA refers to DNA not naturally located in the cell, or in a chromosomal site of the cell. A heterologous gene is a gene in which the regulatory control sequences are not found naturally in association with the coding sequence. In the context of the present invention, an EV biosynthetic enzyme gene is heterologous to the vector DNA in which it is inserted for cloning or expression, and it is heterologous to a host cell containing such a vector, in which it is expressed, e.g., a K562 cell.

A nucleic acid molecule is “hybridizable” to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength (see Sambrook et al., supra). The conditions of temperature and ionic strength determine the “stringency” of the hybridization. For preliminary screening for homologous nucleic acids, low stringency hybridization conditions, corresponding to a T_(m) (melting temperature) of 55° C., can be used (e.g., 5×SSC, 0.1% SDS, 0.25% milk, and no formamide; or 30% formamide, 5×SSC, 0.5% SDS). Moderate stringency hybridization conditions correspond to a higher T_(m), e.g., 40% formamide, with 5× or 6×SSC. High stringency hybridization conditions correspond to the highest T_(m), e.g., 50% formamide, 5× or 6×SSC. SSC is 0.15 M NaCl, 0.015 M Na-citrate. Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of T_(m) for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher T_(m)) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating T_(m) have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridization with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8). A minimum length for a hybridizable nucleic acid is at least about 10 nucleotides; preferably at least about 15 nucleotides; and more preferably the length is at least about 20 nucleotides.
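The buffer strengths recited above (5×, 6×, or 0.2×SSC) follow by simple scaling from the 1× composition of 0.15 M NaCl and 0.015 M Na-citrate. The sketch below performs that scaling. The sodium ion estimate additionally assumes that the citrate is supplied as the trisodium salt, contributing three Na+ per citrate; that assumption, and the function names, are not taken from the specification.

```python
# 1x SSC composition as stated in the text (molar concentrations).
SSC_1X = {"NaCl": 0.15, "Na-citrate": 0.015}


def ssc_molarity(fold: float) -> dict[str, float]:
    """Component molarities of an n-fold SSC buffer, e.g. fold=5 for 5x SSC."""
    if fold <= 0:
        raise ValueError("fold concentration must be positive")
    return {component: molar * fold for component, molar in SSC_1X.items()}


def sodium_ion_molarity(fold: float) -> float:
    """Approximate total Na+ molarity, assuming trisodium citrate (3 Na+ each)."""
    buffer = ssc_molarity(fold)
    return buffer["NaCl"] + 3 * buffer["Na-citrate"]
```

Under these assumptions 5×SSC contains 0.75 M NaCl and 0.075 M Na-citrate, while the 0.2×SSC used in high stringency washes contains 0.03 M NaCl and 0.003 M Na-citrate; lower salt corresponds to higher stringency.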

In a specific embodiment, the term “standard hybridization conditions” refers to a T_(m) of 55° C., and utilizes conditions as set forth above. In a preferred embodiment, the T_(m) is 60° C.; in a more preferred embodiment, the T_(m) is 65° C. In a specific embodiment, “high stringency” refers to hybridization and/or washing conditions at 68° C. in 0.2×SSC, at 42° C. in 50% formamide, 4×SSC, or under conditions that afford levels of hybridization equivalent to those observed under either of these two conditions.

As used herein, the term “oligonucleotide” refers to a nucleic acid, generally of at least 10, preferably at least 15, and more preferably at least 20 nucleotides, preferably no more than 100 nucleotides, that is hybridizable to a genomic DNA molecule, a cDNA molecule, or an mRNA molecule encoding a gene, mRNA, cDNA, or other nucleic acid of interest. Oligonucleotides can be labeled, e.g., with ³²P-nucleotides or nucleotides to which a label, such as biotin, has been covalently conjugated. In one embodiment, a labeled oligonucleotide can be used as a probe to detect the presence of a nucleic acid. In another embodiment, oligonucleotides (one or both of which may be labeled) can be used as PCR primers, either for cloning full length or a fragment of EV biosynthetic enzyme, or to detect the presence of nucleic acids encoding EV biosynthetic enzyme. In a further embodiment, an oligonucleotide of the invention can form a triple helix with a EV biosynthetic enzyme DNA molecule. Generally, oligonucleotides are prepared synthetically, preferably on a nucleic acid synthesizer. Accordingly, oligonucleotides can be prepared with non-naturally occurring phosphoester analog bonds, such as thioester bonds, etc.

EV Biosynthetic Pathway Nucleic Acids

A gene encoding EV biosynthetic enzyme can be isolated from any everninomicin-producing Micromonospora source. Methods for obtaining EV biosynthetic enzyme gene are well known in the art, as described above (see, e.g., Sambrook et al., 1989, supra). The DNA may be obtained by standard procedures known in the art from cloned DNA, by chemical synthesis, by cDNA cloning, or by the cloning of genomic DNA (e.g., DNA having a sequence as deposited with the ATCC and accorded accession no. 39149), or fragments thereof, purified from the desired cell (see, for example, Sambrook et al., 1989, supra; Glover, D. M. (ed.), 1985, DNA Cloning: A Practical Approach, IRL Press, Ltd., Oxford, U.K. Vol. I, II). Whatever the source, the gene can be molecularly cloned into a suitable vector for propagation of the gene. Identification of the specific DNA fragment containing the desired EV biosynthetic enzyme gene may be accomplished in a number of ways. For example, a portion of an EV biosynthetic enzyme gene exemplified infra can be purified and labeled to prepare a labeled probe, and the generated DNA may be screened by nucleic acid hybridization to the labeled probe (Benton and Davis, Science, 1977, 196:180; Grunstein and Hogness, Proc. Natl. Acad. Sci. U.S.A., 1975, 72:3961). Those DNA fragments with substantial homology to the probe, such as an allelic variant from another species, will hybridize. In a specific embodiment, highest stringency hybridization conditions are used to identify a homologous EV biosynthetic enzyme gene.

Further selection can be carried out on the basis of the properties of the gene, e.g., if the gene encodes a protein product having the isoelectric, electrophoretic, amino acid composition, partial or complete amino acid sequence, antibody binding activity, or ligand binding profile of EV biosynthetic enzyme protein as disclosed herein. Thus, the presence of the gene may be detected by assays based on the physical, chemical, immunological, or functional properties of its expressed product.

Other DNA sequences which encode substantially the same amino acid sequence as an EV biosynthetic enzyme gene may be used in the practice of the present invention. These include but are not limited to allelic variants, species variants, sequence conservative variants, and functional variants.

The genes encoding EV biosynthetic enzyme derivatives and analogs of the invention can be produced by various methods known in the art. The manipulations which result in their production can occur at the gene or protein level. For example, the cloned EV biosynthetic enzyme gene sequence can be modified by any of numerous strategies known in the art (Sambrook et al., 1989, supra). The sequence can be cleaved at appropriate sites with restriction endonuclease(s), followed by further enzymatic modification if desired, isolated, and ligated in vitro. In the production of the gene encoding a derivative or analog of EV biosynthetic enzyme, care should be taken to ensure that the modified gene remains within the same translational reading frame as the EV biosynthetic enzyme gene, uninterrupted by translational stop signals, in the gene region where the desired activity is encoded, unless the gene will be used to knock-out or disrupt an endogenous EV biosynthetic enzyme.

Additionally, the EV biosynthetic enzyme-encoding nucleic acid sequence can be mutated in vitro or in vivo, to create and/or destroy translation, initiation, and/or termination sequences, or to create variations in coding regions and/or form new restriction endonuclease sites or destroy preexisting ones, to facilitate further in vitro modification. Such modifications can also be made to introduce restriction sites and facilitate cloning the EV biosynthetic enzyme gene into an expression vector. Any technique for mutagenesis known in the art can be used, including but not limited to, in vitro site-directed mutagenesis (Hutchinson, C., et al., J. Biol. Chem., 1978, 253:6551; Zoller and Smith, DNA, 1984, 3:479-488; Oliphant et al., Gene 1986, 44:177; Hutchinson et al., Proc. Natl. Acad. Sci. U.S.A., 1986, 83:710), use of TAB® linkers (Pharmacia), etc. PCR techniques are preferred for site directed mutagenesis (see Higuchi, “Using PCR to Engineer DNA”, in PCR Technology: Principles and Applications for DNA Amplification, H. Erlich, ed., Stockton Press, 1989, Chapter 6, pp. 61-70).

The identified and isolated gene can then be inserted into an appropriate cloning vector. A large number of vector-host systems known in the art may be used. Possible vectors include, but are not limited to, plasmids or modified viruses, but the vector system must be compatible with the host cell used. Examples of vectors include, but are not limited to, E. coli, bacteriophages such as lambda derivatives, or plasmids such as pBR322 derivatives or pUC plasmid derivatives, e.g., pGEX vectors, pmal-c, pFLAG, etc. The insertion into a cloning vector can, for example, be accomplished by ligating the DNA fragment into a cloning vector which has complementary cohesive termini. However, if the complementary restriction sites used to fragment the DNA are not present in the cloning vector, the ends of the DNA molecules may be enzymatically modified. Alternatively, any site desired may be produced by ligating nucleotide sequences (linkers) onto the DNA termini; these ligated linkers may comprise specific chemically synthesized oligonucleotides encoding restriction endonuclease recognition sequences. Finally, the vector may include a fusion polypeptide sequence such that the construct with the EV biosynthetic enzyme encodes a chimeric protein, such as a poly-histidine tag, FLAG tag, myc epitope tag, or some other such sequence for ease in purification.

Recombinant molecules can be introduced into host cells via transformation, transfection, infection, electroporation, etc., so that many copies of the gene sequence are generated. Preferably, the cloned gene is contained on a shuttle vector plasmid, which provides for expansion in a cloning cell, e.g., E. coli, and facile purification for subsequent insertion into an appropriate expression cell line, if such is desired.

Expression of EV Biosynthetic Enzyme Polypeptides

The nucleotide sequence coding for EV biosynthetic enzyme, or antigenic fragment, derivative or analog thereof, or a functionally active derivative, including a chimeric protein, thereof, can be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for the transcription and translation of the inserted protein-coding sequence. Thus, a nucleic acid encoding EV biosynthetic enzyme of the invention can be operationally associated with a promoter in an expression vector of the invention. Such vectors can be used to express functional or functionally inactivated EV biosynthetic enzyme polypeptides.

The necessary transcriptional and translational signals can be provided on a recombinant expression vector.

Expression of EV biosynthetic enzyme protein may be controlled by any promoter/enhancer element known in the art, but these regulatory elements must be functional in the host selected for expression. Promoters which may be used to control EV biosynthetic enzyme gene expression include, but are not limited to, prokaryotic expression vectors such as the β-lactamase promoter (Villa-Komaroff, et al., Proc. Natl. Acad. Sci. U.S.A., 1978, 75:3727-3731), or the tac promoter (DeBoer, et al., Proc. Natl. Acad. Sci. U.S.A., 1983, 80:21-25; see also “Useful proteins from recombinant bacteria” in Scientific American, 242:74-94, 1980). Among regulable promoters which can be used in the context of the present invention, mention may be made more especially of any regulable promoter which is functional in actinomycetes. These can comprise promoters induced specifically by an agent introduced into the culture medium, such as, for example, the thiostrepton-inducible promoter tipA (Murakami et al., J. Bact., 1989, 171:1459), or thermoinducible promoters such as that of the groEL genes, for example (Mazodier et al., J. Bact., 1991, 173:7382). They can also comprise an actinomycetes promoter which is specifically active in the late phases of the proliferation cycle of actinomycetes, such as, for example, certain promoters of genes of the secondary metabolism (genes for the production of antibiotics, in particular).

Soluble forms of the protein can be obtained by collecting culture fluid, or solubilizing inclusion bodies, e.g., by treatment with detergent, and if desired sonication or other mechanical processes, as described above. The solubilized or soluble protein can be isolated using various techniques, such as polyacrylamide gel electrophoresis (PAGE), isoelectric focusing, 2-dimensional gel electrophoresis, chromatography (e.g., ion exchange, affinity, immunoaffinity, and sizing column chromatography), centrifugation, differential solubility, immunoprecipitation, or by any other standard technique for the purification of proteins.

Antibodies to EV Biosynthetic Enzymes

According to the invention, any EV biosynthetic enzyme polypeptide produced recombinantly or by chemical synthesis, and fragments or other derivatives or analogs thereof, including fusion proteins, may be used as an immunogen to generate antibodies that recognize the EV biosynthetic enzyme polypeptide. Such antibodies include but are not limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments, and an Fab expression library. The anti-EV biosynthetic enzyme antibodies of the invention may be cross reactive, e.g., they may recognize EV biosynthetic enzyme from different species. Polyclonal antibodies have greater likelihood of cross reactivity. Alternatively, an antibody of the invention may be specific for a single form of EV biosynthetic enzyme, such as murine EV biosynthetic enzyme. Preferably, such an antibody is specific for human EV biosynthetic enzyme.

Various procedures known in the art may be used for the production of polyclonal antibodies to EV biosynthetic enzyme polypeptide or derivative or analog thereof. For the production of antibody, various host animals can be immunized by injection with the EV biosynthetic enzyme polypeptide, or a derivative (e.g., fragment or fusion protein) thereof, including but not limited to rabbits, mice, rats, sheep, goats, etc. In one embodiment, the EV biosynthetic enzyme polypeptide or fragment thereof can be conjugated to an immunogenic carrier, e.g., bovine serum albumin (BSA) or keyhole limpet hemocyanin (KLH). Various adjuvants may be used to increase the immunological response, depending on the host species, including but not limited to Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum.

For preparation of monoclonal antibodies directed toward the EV biosynthetic enzyme polypeptide, or fragment, analog, or derivative thereof, any technique that provides for the production of antibody molecules by continuous cell lines in culture may be used. These include but are not limited to the hybridoma technique originally developed by Kohler and Milstein (Nature, 1975, 256:495-497), as well as the trioma technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today, 1983, 4:72; Cote et al., Proc. Natl. Acad. Sci. U.S.A., 1983, 80:2026-2030), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., 1985, pp. 77-96).

According to the invention, techniques described for the production of single chain antibodies (U.S. Pat. Nos. 5,476,786 and 5,132,405 to Huston; U.S. Pat. No. 4,946,778) can be adapted to produce EV biosynthetic enzyme polypeptide-specific single chain antibodies. Indeed, these genes can be delivered for expression in vivo. An additional embodiment of the invention utilizes the techniques described for the construction of Fab expression libraries (Huse et al., Science, 1989, 246:1275-1281) to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity for an EV biosynthetic enzyme polypeptide, or its derivatives, or analogs.

Antibody fragments which contain the idiotype of the antibody molecule can be generated by known techniques. For example, such fragments include but are not limited to: the F(ab′)₂ fragment which can be produced by pepsin digestion of the antibody molecule; the Fab′ fragments which can be generated by reducing the disulfide bridges of the F(ab′)₂ fragment; and the Fab fragments which can be generated by treating the antibody molecule with papain and a reducing agent.

In the production of antibodies, screening for the desired antibody can be accomplished by techniques known in the art, e.g., radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, gel diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (using colloidal gold, enzyme or radioisotope labels, for example), western blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination assays), complement fixation assays, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays, etc. In one embodiment, antibody binding is detected by detecting a label on the primary antibody. In another embodiment, the primary antibody is detected by detecting binding of a secondary antibody or reagent to the primary antibody. In a further embodiment, the secondary antibody is labeled. Many means are known in the art for detecting binding in an immunoassay and are within the scope of the present invention. For example, to select antibodies which recognize a specific epitope of an EV biosynthetic enzyme polypeptide, one may assay generated hybridomas for a product which binds to an EV biosynthetic enzyme polypeptide fragment containing such epitope. For selection of an antibody specific to an EV biosynthetic enzyme polypeptide from a particular species of animal, one can select on the basis of positive binding with EV biosynthetic enzyme polypeptide expressed by or isolated from cells of that species of animal.

The foregoing antibodies can be used in methods known in the art relating to the localization and activity of the EV biosynthetic enzyme polypeptide, e.g., for Western blotting, imaging EV biosynthetic enzyme polypeptide in situ, measuring levels thereof in appropriate physiological samples, etc., using any of the detection techniques mentioned above or known in the art.

In a specific embodiment, antibodies that agonize or antagonize the activity of EV biosynthetic enzyme polypeptide can be generated. Such antibodies can be tested using the assays described infra for identifying ligands.

Techniques of isolating bacterial DNA are readily available and well known in the art. Any such techniques can be employed in this invention. In particular, DNA from these deposited cultures can be isolated as follows. Lyophils of E. coli XL1-Blue/pSPRX272, E. coli XL1-Blue/pSPRX2262, E. coli XL1-Blue/pSPR192, E. coli XL1-Blue/pSPRX210 or E. coli XL1-Blue/pSPRX256 are plated onto L-agar (10 g tryptone, 10 g NaCl, 5 g yeast extract, and 15 g agar per liter) plates containing 100 μg/ml ampicillin to obtain a single colony isolate of the strain. This colony is used to inoculate about 500 ml of L-broth (10 g tryptone, 10 g NaCl, 5 g yeast extract per liter) containing 100 μg/ml apramycin, and the resulting culture is incubated at 37° C. with aeration until the cells reach stationary phase. Cosmid DNA can be obtained from the cells in accordance with procedures known in the art (see, e.g., Rao et al., Methods in Enzymology, 1987, 153:166).
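The per-liter recipes above scale linearly with culture volume. The following Python sketch is illustrative only: the helper names `scale_recipe` and `antibiotic_mg` are hypothetical and not part of the original disclosure. It works out the amounts needed for the roughly 500 ml L-broth culture described.

```python
# Per-liter recipes as stated in the text (grams per liter).
L_BROTH_G_PER_L = {"tryptone": 10.0, "NaCl": 10.0, "yeast extract": 5.0}
L_AGAR_G_PER_L = {**L_BROTH_G_PER_L, "agar": 15.0}


def scale_recipe(recipe_g_per_l, volume_ml):
    """Return grams of each component needed for volume_ml of medium."""
    return {name: grams * volume_ml / 1000.0
            for name, grams in recipe_g_per_l.items()}


def antibiotic_mg(conc_ug_per_ml, volume_ml):
    """Return milligrams of antibiotic for a final concentration in ug/ml."""
    return conc_ug_per_ml * volume_ml / 1000.0


# About 500 ml of L-broth with antibiotic at 100 ug/ml, as in the text.
broth_500 = scale_recipe(L_BROTH_G_PER_L, 500)
print(broth_500)                 # grams of each component
print(antibiotic_mg(100, 500))   # milligrams of antibiotic
```

For 500 ml this gives 5 g tryptone, 5 g NaCl, 2.5 g yeast extract, and 50 mg of antibiotic.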

DNA of the current invention can be sequenced using any known techniques in the art such as the dideoxynucleotide chain-termination method (Sanger et al., Proc. Natl. Acad. Sci., 1977, 74:5463) with either radioisotopic or fluorescent labels. Double-stranded, supercoiled DNA can be used directly for templates in sequence reactions with sequence-specific oligonucleotide primers. Alternatively, fragments can be used to prepare libraries of either random, overlapping sequences in the bacteriophage M13 or nested, overlapping deletions in a plasmid vector. Individual recombinant DNA subclones are then sequenced with vector-specific oligonucleotide primers. Radioactive reaction products are electrophoresed on denaturing polyacrylamide gels and analyzed by autoradiography.

Fluorescently labeled reaction products are electrophoresed and analyzed on Applied Biosystems (ABI Division, Perkin Elmer, Foster City, Calif. 94404) model 370A and 373A or Dupont (Wilmington, Del.) Genesis DNA sequencers. Sequence data are assembled and edited using Genetics Computer Group (GCG, Madison, Wis.) programs GelAssemble and Seqed or the ABI model 670 Inherit Sequence Analysis system and the AutoAssembler and SeqEd programs.

Polypeptides corresponding to a domain, a submodule, a module, a synthesis unit (SU), or an open reading frame can be produced by transforming a host cell such as bacteria, yeast, or eukaryotic cell-expression system with the cDNA sequence in a recombinant DNA vector. It is well within one skilled in the art to choose among host cells and numerous recombinant DNA expression vectors to practice the instant invention. Multifunctional polypeptides of polyketide everninomicin synthase can be extracted from everninomicin-producing bacteria such as Streptomyces ambofaciens or translated in a cell-free in vitro translation system. In addition, the techniques of synthetic chemistry can be employed to synthesize some of the polypeptides mentioned above.

Procedures and techniques for isolation and purification of proteins produced in recombinant host cells are known in the art. See, for example, Roberts et al., Eur. J. Biochem., 1993, 214:305-311 and Caffrey et al., FEBS, 1992, 304:225-228 for detailed description of polyketide synthase purification in bacteria. To achieve a homogeneous preparation of a polypeptide, proteins in the crude cell extract can be separated by size and/or charge through different columns well known in the art once or several times. In particular, the crude cell extract can be applied to various commercially available cellulose columns such as DEAE-cellulose columns. Subsequently the bound proteins can be eluted and the fractions can be tested for the presence of the polyketide everninomicin synthase or engineered derivative protein. Techniques for detecting the target protein are readily available in the art. Any such techniques can be employed for this invention.

In particular, the fractions can be analyzed on Western blot using antibodies raised against a portion or portions of such polyketide everninomicin synthase proteins. The fractions containing the polyketide everninomicin synthase protein can be pooled and further purified by passing through more columns well known in the art, such as applying the pooled fractions to a gel filtration column. When visualized on SDS-PAGE gels, homogeneous preparations contain a single band and are substantially free of other proteins.

Actinomycetes are prolific producers of secondary metabolites with antimicrobial and antifungal activity and represent a significant source of active compounds for pharmaceuticals. The genus Streptomyces produces a wide variety of secondary metabolites including antitumor, antifungal, and antimicrobial agents. The biosynthesis of these compounds has been shown to be directed by large multi-functional proteins or a number of proteins each catalyzing specific steps in the biosynthesis of the secondary metabolite (Strohl, 1997). The genes encoding actinomycete secondary metabolite biosynthesis have been found to be clustered on contiguous segments of each producing organism's genome (Strohl, William R., 1997, Biotechnology of Antibiotics, 2^(nd) Ed., Marcel Dekker, Inc., New York, N.Y.). This makes it feasible for complete pathways to be cloned, analyzed, genetically manipulated and expressed in surrogate hosts.

Components of the Everninomicin Biosynthetic Pathway

Orsellinic Acid Biosynthesis

The term “polyketide” refers to a class of molecules produced through the successive condensation of small carboxylic acids. This diverse group includes plant flavonoids, fungal aflatoxins, and hundreds of compounds of different structures that exhibit antibacterial, antifungal, antitumor, and anthelmintic properties. Some polyketides produced by fungi and bacteria are associated with sporulation or other developmental pathways; others do not yet have an ascribed function. Some polyketides have more than one pharmacological effect. The diversity of polyketide structures reflects the wide variety of their biological properties. Many cyclized polyketides undergo glycosidation at one or more sites, and virtually all are modified during their synthesis through hydroxylation, reduction, epoxidation, etc.

For the purposes of the present invention, “polyketide” refers to the orsellinic acid moiety in everninomicin. Thus, the invention provides, in particular, the DNA sequence encoding the polyketide synthase responsible for biosynthesis of this orsellinic acid moiety of everninomicin, i.e., the everninomicin orsellinic acid synthetase. The everninomicin orsellinic acid synthase DNA sequence, which defines the orsellinic synthase gene cluster, directs biosynthesis of the orsellinic acid polyketide by encoding the various distinct activities of orsellinic synthase. The skilled artisan recognizes, however, that the everninomicin orsellinic synthase genes are useful in the production of other polyketides, e.g., by recapitulating all or part of this component of the biosynthetic pathway, or by modulating biosynthetic pathways (see the discussion about combinatorial biosynthesis, infra).

The gene cluster for orsellinic synthase, like other Type I polyketide biosynthetic synthase genes whose organization has been elucidated, is characterized by the presence of an ORF encoding a multi-functional protein which contains separate active sites for condensation of acyl groups as defined above. The map of the orsellinic synthase gene derived from Micromonospora carbonacea var. africana is shown in FIG. 3. The accompanying synthetic pathway and the specific carboxylic acid substrates that are used for each condensation of orsellinic acid synthesis are indicated in FIG. 4.

Polyketides are complex secondary metabolites synthesized from the condensation of acetyl-coenzyme A (CoA) or related acyl-CoAs by polyketide synthetase enzymes. Other acyl groups forming the acyl-CoA include malonyl, propionyl, and butyryl. Condensation of extender units requires the action of β-ketoacyl ACP synthetase, acetyltransferase and acyl carrier protein enzymatic sites. Each module processes one condensation step and typically requires several activities accomplished by several active sites including acyl carrier protein (ACP), β-ketosynthase (KS), and acyltransferase (AT). The specific gene products identified with orsellinic biosynthesis are listed in Table 2.

TABLE 2 Orsellinic Acid Biosynthetic Gene Products

Gene Product   CDS                      SEQ ID No.   Enzymatic Function
evrF           21,064 . . . 22,542      36, 37       non-heme oxygenase/halogenase addition
evrI           25,550 . . . 26,626      42, 43       acyl starter unit
evrJ           26,685 . . . 30,479      44, 45       Orsellinic acid synthase/6-methylsalicylic acid synthase
evbD           56,961 . . . 58,709      92, 93       acyl-CoA carboxylase
evbQ           74,707 . . . 76,290*     122, 123     Methylmalonyl-CoA mutase
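As a reading aid, the CDS coordinates in Table 2 can be turned into gene lengths and checked for a whole number of codons. The short Python sketch below is illustrative only: the function names are hypothetical, the asterisk is carried through as an opaque flag because its meaning is not defined in this passage, and coordinates are assumed to be 1-based and inclusive.

```python
import re

# (gene, CDS field) copied from Table 2; "*" is kept as an opaque flag.
TABLE_2 = [
    ("evrF", "21,064 . . . 22,542"),
    ("evrI", "25,550 . . . 26,626"),
    ("evrJ", "26,685 . . . 30,479"),
    ("evbD", "56,961 . . . 58,709"),
    ("evbQ", "74,707 . . . 76,290*"),
]


def parse_cds(field):
    """Parse 'start . . . end[*]' into (start, end, flagged)."""
    flagged = field.strip().endswith("*")
    start, end = (int(n.replace(",", ""))
                  for n in re.findall(r"\d[\d,]*", field))
    return start, end, flagged


def cds_summary(field):
    """Length in bp (assuming inclusive coordinates) and codon count."""
    start, end, flagged = parse_cds(field)
    length = end - start + 1
    return {
        "length_bp": length,
        "codons": length // 3,       # includes the stop codon, if present
        "in_frame": length % 3 == 0,
        "flagged": flagged,
    }


for gene, field in TABLE_2:
    print(gene, cds_summary(field))
```

Under these assumptions every entry in Table 2 spans a multiple of three bases; evrJ, for example, spans 3,795 bp.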

Polyketide synthetases are classified as either iterative Type I, iterative Type II or modular polyketide synthetases. Iterative Type I synthetases resemble the multifunctional fatty acid synthases from animals and are composed of multifunctional proteins with separate protein domains encoding each active site. This is exemplified by the actinomycete S. erythraea polyketide synthetase for the biosynthesis of erythromycin, the Streptomyces viridochromogenes Tü57 AviM synthesis of orsellinic acid and the Penicillium patulum polyketide synthase for 6-methylsalicylic acid (Hutchinson et al., Annual Review of Microbiology, 1995, 49:201-238; Gaisser et al., Journal of Bacteriology, 1997, 179:6271-6278; Beck et al., European Journal of Biochemistry, 1990, 192:487-498). Iterative Type II synthetases have separate proteins for each active site. These are exemplified by the polyketide synthetases from S. coelicolor, S. violaceoruber and S. glaucescens synthesizing the aromatic polyketides actinorhodin, granaticin and tetracenomycin, respectively (Hopwood, et al., Annual Review of Microbiology, 1990, 24:37-66). The modular polyketide synthetases are large proteins that contain several domains, with each domain containing several active sites. An example of a modular polyketide synthetase is the 6-deoxyerythronolide B synthetase from Saccharopolyspora erythraea. Recent reviews of polyketides and polyketide synthetases elaborate on these pathways (Hopwood, et al., Annual Review of Microbiology, 1990, 24:37-66; Hutchinson et al., Annual Review of Microbiology, 1995, 49:201-238).

Although not wishing to be bound to any particular theory or technical explanation, a sequence similarity exists among domain boundaries in various polyketide synthase genes. Thus, one skilled in the art is able to predict the domain boundaries of newly discovered polyketide synthase genes based on the sequence information of known polyketide synthase genes. In particular, the boundaries of submodules, domains, and open reading frames in the instant application are predicted based on sequence information disclosed in this application and the locations of the domain boundaries of the everninomicin synthase (Donadio et al., GENE, 1992, 111:51-60). Furthermore, the genetic organization of the everninomicin synthase gene cluster appears to correspond to the order of the reactions required to complete synthesis of everninomicin. This means that the polyketide synthase DNA sequence can be manipulated to generate predictable alterations in the final everninomicin product.

Acyl Precursor Formation

EvrJ (orsellinic acid synthetase) requires one acetyl-CoA starter and three malonyl-CoA extender units to synthesize orsellinic acid. The acetyl-CoA and malonyl-CoA units most likely are derived from glycolysis and fatty acid biosynthesis (Tang L, et al., Ann. NY Acad. Sci., 1994, 721:105-16). The malonyl-CoA can also be derived from acetyl-CoA by carboxylation by acetyl-CoA carboxylase (Scott and Eagleson, Concise Encyclopedia of Biochemistry, 2^(nd) Ed., Walter de Gruyter: Berlin, 1988). The M. carbonacea EV region contains an evbD which has strong homology to known acetyl-CoA carboxylases. Thus evbD is responsible for the synthesis of the malonyl-CoA unit required for orsellinic acid biosynthesis as shown in FIG. 4.
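The stoichiometry stated above (one acetyl-CoA starter plus three malonyl-CoA extenders) can be checked by simple carbon bookkeeping. The Python sketch below is a minimal illustration, not part of the original disclosure: the function names are hypothetical, and it assumes the standard decarboxylative condensation in which each malonyl extender (three carbons) contributes two carbons to the chain and releases one as CO2.

```python
ACETYL_CARBONS = 2             # acetyl starter unit
MALONYL_CARBONS = 3            # malonyl extender unit
CO2_LOST_PER_CONDENSATION = 1  # decarboxylation at each condensation


def polyketide_chain_carbons(n_extenders):
    """Carbons in the chain after n_extenders decarboxylative condensations."""
    per_extender = MALONYL_CARBONS - CO2_LOST_PER_CONDENSATION
    return ACETYL_CARBONS + n_extenders * per_extender


def acetyl_coa_equivalents(n_extenders):
    """Acetyl-CoA needed if every malonyl-CoA is made by carboxylating acetyl-CoA."""
    return 1 + n_extenders


# Orsellinic acid: one acetyl-CoA starter and three malonyl-CoA extenders.
print(polyketide_chain_carbons(3))  # chain carbons
print(acetyl_coa_equivalents(3))    # acetyl-CoA equivalents consumed
```

Three extenders give an eight-carbon chain, matching the eight carbons of orsellinic acid (C8H8O4). If all malonyl-CoA is supplied by carboxylation of acetyl-CoA, four acetyl-CoA equivalents and three carboxylation steps are consumed per molecule.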

Sugar Biosynthetic Products and Glycosyltransferases

Glycosyl groups (6-deoxysugars) are synthesized by a common mechanism involving hexose-1-P nucleotidyl-transferase, dTDP-D-glucose synthetase and dTDP-D-glucose 4,6-dehydratase. L-deoxysugars are synthesized by the action of an NDP-4-keto-6-deoxyhexose 3,5-epimerase. Deoxysugars can be modified by deoxygenations, transaminations, methylations and isomerizations or epimerizations prior to covalent attachment by a glycosyltransferase.

Biosynthesis of the sugars (see Liu and Thorson, Annu. Rev. Microbiol., 1994, 48:223) that are attached to the orsellinic acid/polyketide, and the enzymes that mediate attachment of the sugars, are also key elements of the everninomicin biosynthetic pathway. Genes encoding such sugar biosynthetic enzymes and glycosyltransferases are typically found in the biosynthetic pathway locus (see Summers et al., Microbiology, 1997, 143:3251). The genes identified from the EV biosynthetic locus are listed in Tables 3 and 4.

TABLE 3 Sugar Biosynthetic Gene Products

Gene Product   CDS                      SEQ ID No.   Enzymatic Function
evdA           132 . . . 1382*          2, 3         Hydroxylase
evdB           1490 . . . 2611*         4, 5         hexose aminotransferase
evdC           2622 . . . 3860*         6, 7         oxidase (flavoprotein)
evdE           5309 . . . 6235          10, 11       hexose dehydratase
evdI           9463 . . . 10,224*       18, 19       Hydrolase
evdK           11,208 . . . 12,455      22, 23       hexose dehydratase or epimerase
evrA           14,410 . . . 15,363*     26, 27       hexose epimerase
evrB           15,380 . . . 16,414*     28, 29       hexose oxidoreductase
evrC           16,419 . . . 17,873*     30, 31       hexose dehydratase
evrD           17,870 . . . 18,934*     32, 33       GDP-mannose 4,6-dehydratase
evrV           41,679 . . . 42,707*     68, 69       dTDP-glucose epimerase
evrW           42,810 . . . 43,799*     70, 71       dTDP-glucose dehydratase
evrX           43,799 . . . 44,866      72, 73       dTDP-glucose synthetase
evbS           78,791 . . . 80,521      126, 127     Phosphomannomutase
evbU           83,280 . . . 83,888      130, 131     Glucose-6-phosphate 1-dehydrogenase
ORF9           8254 . . . 9318          199, 200     Oxidoreductase
ORF11          10,584 . . . 11,585      203, 204     Deoxyhexose ketoreductase

TABLE 4 Glycosyltransferases

Gene Product   CDS                      SEQ ID No.   Enzymatic Function
evdD           4143 . . . 5312          8, 9         DNTP-hexose glycosyltransferase
evdF           6232 . . . 7275          12, 13       DNTP-hexose glycosyltransferase
evdH           8342 . . . 9364          16, 17       DNTP-hexose glycosyltransferase
evdL           12,108 . . . 13,022*     24, 25       DNTP-hexose glycosyltransferase
evrS           38,892 . . . 40,163*     62, 63       DNTP-hexose glycosyltransferase

These genes are important targets for modulation. They are likely to be bottleneck genes, and thus increased expression using an exogenous or integrating vector can increase the yield of everninomicin (or its analog). Alternatively, knocking out these genes may result in complete elimination of everninomicin biosynthesis.
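The coordinates in Tables 3 and 4 also show how tightly the sugar biosynthetic and glycosyltransferase genes are packed. The Python sketch below is illustrative only: it assumes that the evdA through evdF coordinates refer to one shared coordinate system, ignores the asterisk flag, and uses a hypothetical helper name. It computes the spacing between neighboring genes, where a negative value indicates overlapping coding sequences.

```python
# (gene, start, end) for evdA-evdF, copied from Tables 3 and 4.
EVD_GENES = [
    ("evdA", 132, 1382),
    ("evdB", 1490, 2611),
    ("evdC", 2622, 3860),
    ("evdD", 4143, 5312),
    ("evdE", 5309, 6235),
    ("evdF", 6232, 7275),
]


def intergenic_gaps(genes):
    """Return (gene_a, gene_b, gap_bp) for each adjacent pair, sorted by start.

    gap_bp is the number of bases strictly between the two CDS spans;
    a negative value means the spans overlap by that many bases.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    gaps = []
    for (name_a, _, end_a), (name_b, start_b, _) in zip(ordered, ordered[1:]):
        gaps.append((name_a, name_b, start_b - end_a - 1))
    return gaps


for name_a, name_b, gap in intergenic_gaps(EVD_GENES):
    print(f"{name_a} -> {name_b}: {gap} bp")
```

Under these assumptions, evdD/evdE and evdE/evdF each overlap by 4 bp, and the largest spacing in this stretch is 282 bp (evdC to evdD).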

Tailoring Enzymes

Various types of EV biosynthetic enzymes fall into the tailoring enzyme category. These are listed in Table 5. Increasing or decreasing expression of these enzymes permits production of everninomicin analogs. Moreover, expression of these enzymes in other actinomycetes permits production of novel secondary metabolites by the action of the everninomicin tailoring enzymes on these metabolites.

TABLE 5 Tailoring Gene Products

Gene Product   CDS                      SEQ ID No.   Enzymatic Function
evrG           22,748 . . . 24,172      38, 39       oxidase
evrL           31,941 . . . 32,882*     48, 49       heme biosynthesis
evrM           33,167 . . . 34,405*     50, 51       p450 hydroxylase
evrN           34,449 . . . 35,210*     52, 53       methyl transferase
evrQ           36,998 . . . 38,026*     58, 59       oxidoreductase/heat stress protein
evrT           40,216 . . . 40,890      64, 65       L-proline hydroxylase
evrU           40,887 . . . 41,576      66, 67       methyltransferase
evbA           53,554 . . . 54,207      84, 85       o-methyltransferase
evbE           58,873 . . . 60,312      94, 95       IMP dehydrogenase
evbI           66,469 . . . 67,872*     106, 107     lipoamide dehydrogenase
evbL           69,610 . . . 70,359*     112, 113     acetyltransferase/phosphotransferase
evbX           85,909 . . . 87,342      136, 137     aldehyde dehydrogenase
evbY           87,422 . . . 88,591      138, 139     aldehyde dehydrogenase
evcB           89,817 . . . 91,067      144, 145     cytochrome D oxidase subunit I
evcC           91,078 . . . 92,085      146, 147     cytochrome D oxidase subunit II

Regulatory Products: Serine-Threonine Kinases

Protein serine (Ser), threonine (Thr), and tyrosine (Tyr) kinases play essential roles in signal transduction in organisms ranging from yeast to mammals, where they regulate diverse cellular activities. Genes that encode eukaryotic-type protein kinases have also been identified in different bacterial species, suggesting that such enzymes are also widespread in prokaryotes. Although many of them have yet to be fully characterized, several studies indicate that eukaryotic-type protein kinases play important roles in regulating cellular activities of these bacteria, such as cell differentiation and secondary metabolism (Cheng-Cai, Molecular Microbiology, 1996, 20:9-15). Examples that have been studied include the pknD Ser/Thr kinase from Anabaena sp. PCC7120, which is involved in nitrogen metabolism control (Zhang et al., Molecular and General Genetics, 1998, 258:26-33); the pkn9 Ser/Thr kinase from Myxococcus xanthus, which is involved in development of fruiting bodies (Hanlon et al., Molecular Microbiology, 1997, 23:459-71); and the afsK Ser/Thr kinase from Streptomyces coelicolor, which is involved in aerial mycelium formation (Ueda et al., Gene, 1996, 169:91-95). These genes from the EV biosynthetic locus are listed in Table 6.

TABLE 6 Regulatory Gene Products

Gene Product   CDS                        SEQ ID No.   Enzymatic Function
evrR           38,072 . . . 38,566        60, 61       hexaheme nitrite reductase regulator/methyltransferase
evsA           47,156 . . . 49,234*       78, 79       serine-threonine kinase
evbF           60,472 . . . 61,029*       96, 97
evbF2          61,610 . . . 62,069        100, 101
evbK           68,529 . . . 69,494*       110, 111     protease synthase/sporulation regulator
evbR           76,622 . . . 78,712        124, 125     protein serine-threonine kinase (eukaryotic type)
evcJ           100,733 . . . 101,326*     160, 161     ATP/GTP binding protein
ORF1           189 . . . 1064*            183, 184     Transcriptional regulator biotinylation
ORF4           3776 . . . 4276*           189, 190     ECF sigma factor

The evsA and evbR proteins within the everninomicin cluster have a high degree of homology to Ser/Thr kinases and may play a role in regulating the expression of the pathway. Manipulation of the evsA and evbR proteins could enhance the expression and yield of everninomicin from M. carbonacea by providing positive signals for biosynthesis. Thus, these genes are preferred elements in a vector to enhance the efficiency of everninomicin biosynthesis.

Resistance Mechanisms

Actinomycetes utilize a variety of mechanisms to confer resistance to the secondary metabolites they produce. These include membrane pumps, rRNA methylases, O-phosphorylation, N-acetylation, and production of resistant target proteins (Cundliffe, Annual Review of Microbiology, 1989, 43:207-33). The genes from the EV biosynthetic locus that have this function are listed in Table 7.

TABLE 7 Resistance Mechanism Genes

Gene Product   CDS                        SEQ ID No.   Enzymatic Function
evrE           19,374 . . . 20,906        34, 35       multidrug efflux transporter
evrY           45,014 . . . 45,760*       74, 75       dehalogenase
evrZ           45,962 . . . 46,714*       76, 77       muramidase/lysozyme
evbB           54,362 . . . 55,117*       86, 87       membrane pump
evbC           55,135 . . . 56,094*       88, 89       membrane pump
evbC2          56,184 . . . 56,813*       90, 91       ankyrin-like
evbG           62,122 . . . 63,795        102, 103     ABC transporter
evbH           63,891 . . . 65,828        104, 105     ABC transporter
evcD           92,148 . . . 93,833        148, 149     ABC transporter
evcE           93,830 . . . 95,671        150, 151     ABC transporter
evrMR          107,653 . . . 108,615      172, 173     23S rRNA methylase
evrMR2         108,635 . . . 109,216      174, 175
ORF6           5392 . . . 6147*           193, 194     rRNA methyltransferase

Multi-drug transporters are membrane proteins that are able to expel a broad range of toxic molecules from microbial cells. These multidrug transporters belong to the ATP-binding cassette (ABC) family of transport proteins that utilize the energy of ATP hydrolysis for activity. In microorganisms, multidrug transporters play an important role in conferring antibiotic resistance on pathogens, and in actinomycetes confer resistance to the antibiotic secondary metabolites produced by these organisms themselves (Fath et al., Microbial Reviews, 1993, 57:995-1017). A second class of membrane transporters found in actinomycetes includes MDR (multiple drug resistance) type pumps found in eukaryotes (Guilfoile et al., Proc. Natl. Acad. Sci. USA, 1991, 88:8553-8557). The EV cluster contains evbB and evbC, which are homologous to the ATP-binding cassette (ABC) family of transport proteins and specifically to the mithramycin resistance pump from Streptomyces argillaceus (Fernandez et al., Molecular and General Genetics, 1996, 251:692-698). In addition, the EV cluster contains evrE, an MDR type pump with homology to the Streptomyces peucetius drrA MDR type pump that confers resistance to daunorubicin. Ribosomal methylases have also been found to confer resistance to producing organisms. The tlrB 23S rRNA methylase from Streptomyces fradiae and the myrA 23S rRNA methylase from Micromonospora griseorubida have been found to confer resistance to tylosin and mycinamicin, respectively.

The EV cluster also contains evrMR, a 23S rRNA methylase with (loc.) homology to both tlrB and myrA.

The EV pathway also contains evrZ, a gene with homology to muramidases. Muramidases (lysozyme) cleave β1,4 linkages between N-acetylglucosamine and N-acetylmuramic acid (Scott and Eagleson, Concise Encyclopedia of Biochemistry, 2^(nd) Ed., Walter de Gruyter: Berlin, 1988, p. 353). Thus, evrZ may inactivate everninomicin by cleavage within the glycosyl bonds.

Increased levels of expression of one or more of these resistance genes are expected to enhance the efficiency of everninomicin biosynthesis in an enhanced biosynthetic system by reducing toxicity to the host cell.

Furthermore, these resistance genes are good candidates for use as positive selection markers in recombinant systems. By including an everninomicin resistance gene in a vector, a host cell successfully transformed with the vector will demonstrate everninomicin resistance. Thus, everninomicin becomes a useful tool for selecting transformed host cells.

Biosynthetic Production and Modification of Everninomicins

There are a number of uses for the cloned Micromonospora carbonacea EV cluster DNA. The cloned genes can be used to improve the yields of everninomicins and to produce novel everninomicins. Improved yields can be obtained by introduction of second copies of genes for enzymes that are rate limiting in the pathway (“bottleneck genes”). This can be accomplished by cloning genes onto vectors, preferably integrating vectors, then obtaining integrants in the chromosome. Alternatively, a rate limiting enzyme gene can be modified by associating it with a strongly expressing promoter sequence and then integrating this construct into the chromosome. Manipulation of regulatory proteins including the Ser/Thr kinases can enhance yields by obtaining mutants that express EV pathway genes at higher levels than parental organisms.

Novel everninomicins can be produced by using cloned fragments to disrupt steps in the biosynthesis of everninomicin. Disruptions can lead to the accumulation of precursors or “shunt” products. To generate disruptions, DNA fragments of internal segments of genes (lacking 5′ and 3′ sequences) can be cloned into insertion vectors. These constructs can be introduced into the parental organism, and homologous recombinants can be selected that carry two copies of the gene in the chromosome. One copy lacks 3′ sequences and the second copy lacks upstream native promoter sequences and 5′ sequences. Alternatively, DNA fragments of genes containing internal deletions or insertions can be cloned into gene replacement vectors. Recombinants can be obtained that contain internal deletions or insertions of genes, which results in a non-functional chromosomal copy of the gene. Constructs intended to recombine into the chromosome at a useful frequency should contain fragments of sufficient size for recombination to occur (300 to 600 bases). Modified everninomicins produced by disrupting the genes may be antibiotics themselves, or serve as substrates for further chemical modification, creating new semi-synthetic everninomicins with unique properties or spectra of activity.
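The outcome of the insertion-vector approach described above can be illustrated with a simple coordinate model. The Python sketch below is a hedged illustration rather than a protocol: the function names are hypothetical, positions are 0-based and half-open relative to the start of the coding sequence, and the 1,479 bp gene length and 400 to 900 fragment are arbitrary example values. It shows that a single crossover through an internal fragment leaves two truncated copies, neither of which spans the whole gene.

```python
def single_crossover_disruption(gene_len, frag_start, frag_end):
    """Model integration of a vector carrying gene[frag_start:frag_end].

    Returns the gene coordinates covered by the two resulting chromosomal
    copies: the first keeps the native promoter and 5' end but lacks the
    3' end; the second keeps the 3' end but lacks the promoter and 5' end.
    """
    if not (0 < frag_start < frag_end < gene_len):
        raise ValueError("fragment must be internal (lack both 5' and 3' ends)")
    promoter_proximal_copy = (0, frag_end)
    promoter_distal_copy = (frag_start, gene_len)
    return promoter_proximal_copy, promoter_distal_copy


def fragment_in_suggested_range(frag_start, frag_end, low=300, high=600):
    """Check the fragment length against the 300 to 600 base range in the text."""
    return low <= frag_end - frag_start <= high


# Example values (hypothetical): a 1,479 bp gene and a 500 bp internal fragment.
copy_a, copy_b = single_crossover_disruption(1479, 400, 900)
print(copy_a, copy_b)
print(fragment_in_suggested_range(400, 900))
```

Because the first copy ends at the fragment's 3′ boundary and the second begins at its 5′ boundary, neither copy contains the complete coding sequence, which is why the insertion disrupts the gene.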

Novel everninomicins can also be produced by mutagenesis of the cloned genes, and replacement of the mutated genes for their unmutated counterparts in the everninomicin producer. Mutagenesis may involve, for example, (1) manipulation of the orsellinic acid PKS Type I gene by introduction of KR, DH or ER domains (see Donadio et al., 1993), e.g., to yield a modified orsellinic acid nucleus; (2) manipulation of the glycosyltransferase to relax substrate or glycosyl specificity, e.g., to yield everninomicin containing novel glycosyl groups or additional glycosyl groups; and/or (3) manipulation of glycosyl biosynthetic genes, e.g., to yield novel glycosyl groups and everninomicin containing novel glycosyl groups.

The DNA from the everninomicin biosynthetic cluster can be used as a hybridization probe to identify homologous sequences. Thus, the DNA cloned here could be used to obtain uncloned regions flanking the region described here but not yet isolated. In addition, DNA from the region cloned here may be useful in identification of non-identical but similar sequences in other organisms.

The modified strains provided by the invention may be cultivated to provide everninomicins using conventional protocols.

Genetic Manipulation of Actinomycetes

Protocols have been developed to genetically manipulate actinomycete genomes and biosynthetic pathways. These include E. coli-actinomycete shuttle vectors, gene replacement systems, transformation protocols, transposon mutagenesis, insertional mutagenesis, integration systems and heterologous host expression. These techniques are reviewed in numerous articles (Baltz et al., Trends Microbiol., 1998, 2:76-83; Hopwood et al., Genetic Manipulation of Streptomyces: A Laboratory Manual, 1985; Wohlleben et al., Acta Microbiol. Immunol. Hung., 1994, 41:381-9 [Review]).

The development of vectors for the genetic manipulation of actinomycetes began with the observation of plasmids in actinomycetes and the development of a transformation protocol for actinomycete protoplasts using polyethylene glycol (Bibb et al., Nature, 1980, 284:526-31). Many standard molecular techniques for Streptomyces were developed by Hopwood and colleagues for Streptomyces coelicolor and Streptomyces lividans (Hopwood et al., Genetic Manipulation of Streptomyces: A Laboratory Manual, 1985). These techniques have been adapted and expanded to other actinomycetes.

Vectors incorporating antibiotic-resistance markers (AmR, ThR, SpR) that function in Streptomyces spp., together with other features, have allowed the development of vectors for (a) integration via homologous recombination between cloned DNA and the Streptomyces spp. chromosome, (b) autonomous replication, (c) site-specific integration at the bacteriophage phiC31 attachment (att) site or pSAM2 attachment site, and (d) gene replacement. Homologous recombination between the cloned DNA and the chromosome can be used to make insertional knockouts of specific genes. Autonomously replicating plasmids and integrating plasmids can be used to introduce heterologous genes into actinomycetes for complementation or expression studies.

Many actinomycetes contain restriction systems that limit the ability to transform organisms by protoplast transformation. More recent gene transfer procedures have been developed for introducing DNA into streptomycetes by conjugation from Escherichia coli. This employs a simple mating procedure for the conjugal transfer of vectors from E. coli to Streptomyces spp. that involves plating of the donor strain and either germinated spores or mycelial fragments of the recipient strain. Conjugal plasmids contain the 760-bp oriT fragment from the IncP plasmid RK2 and are transferred by supplying transfer functions in trans from the E. coli donor strain. Other recent developments that increase the frequency of recombination of non-replicating plasmids into the recipient actinomycete chromosome include transformation of non-replicating plasmids into protoplasts using denatured plasmid DNA (Oh and Chater, J. Bacteriol., 1997, 179:122-7) and conjugation of non-replicating plasmids from a methyl minus strain of E. coli (Smith et al., FEMS Microbiol. Lett., 1997, 155:223-9).

Various strategies have been used to obtain gene replacements in streptomycetes, for the construction of mutations and the modification of biosynthetic pathways (Baltz et al., 1998, supra; Hopwood et al., supra; Wohlleben et al., 1994, supra; Baltz and Hosted, TIBTECH, 1996, 14:245; Baltz, Curr. Op. Biotech., 1990, 1:12-20). These methods have typically employed a two or three step procedure that results in allelic exchange. Initial crossover events between a non-integrating phage, non-replicating plasmid, or temperature sensitive plasmid and the streptomycete chromosome are selected for by antibiotic resistance. Subsequent recombination events that result in gene replacement can be detected by screening the progeny of the initial recombinants by PCR analysis, Southern analysis, appearance of an expected phenotype, or screening for the loss of a resistance marker which had previously been exchanged into the loci to be replaced. The last of these methods has been employed by Khosla et al., Mol. Microbiol., 1992, 6:3237-49, and Khosla et al., J. Bacteriol., 1993, 175:2197-204, to successfully modify the polyketide biosynthetic route of S. coelicolor. The strategy employed by Khosla et al., 1992, supra, also has the advantage of allowing placement of non-selectable and phenotypically silent alleles into chosen positions of the chromosome. Donadio et al., Proc. Natl. Acad. Sci. U.S.A., 1993, 90:7119-23, have also successfully reprogrammed the erythromycin pathway of Saccharopolyspora erythraea by gene replacement.

Non-replicating plasmids for gene replacement were initially utilized by Hilleman et al., Nucleic Acids Res., 1991, 19:727-31, who used a derivative of pDH5 to construct mutations in the phosphinothricin tripeptide biosynthetic pathway of S. hygroscopicus. Plasmid-integration events were obtained by thiostrepton selection. Subsequent screening of the primary recombinants indicated that 4 of 100 isolates had undergone a double-crossover gene replacement.

Use of counterselectable or negative selection markers such as rpsL (confers streptomycin sensitivity) or sacB (confers sucrose sensitivity) has been widely employed in other microorganisms for selection of recombination that results in gene replacement. In S. coelicolor, Buttner utilized glk as a counterselectable marker in att minus phiC31 phage to select for recombination events to construct gene replacement mutants of three S. coelicolor RNA polymerase sigma factors (Buttner et al., J. Bacteriol., 1990, 172:3367-78). Hosted has developed a gene replacement system utilizing the rpsL gene for counterselection (Hosted and Baltz, J. Bacteriol., 1997, 179:180-6).

The construction of recombinant streptomycete strains to produce hybrid secondary metabolites has been accomplished. Current procedures use recombinant DNA techniques to isolate and manipulate secondary metabolic pathways and to express these pathways in surrogate hosts such as Streptomyces lividans. Heterologous expression of diverse pathways, including polyketide, oligopeptide and β-lactam biosynthetic pathways, has been achieved. Furthermore, novel polyketide structures have been generated through the manipulation of polyketide genes forming chimeric pathways. Recently, novel polyketide modules have been isolated from environmental sources using PCR amplification and expressed in Streptomyces to yield novel chemical structures (Strohl et al., J. Industr. Microbiol., 1991, 7:163; Kim et al., J. Bacteriol., 1995, 77:1202; Ylihonko et al., Microbiology, 1996, 142:1965).

Knowledge of the everninomicin synthase DNA sequence, its genetic organization, and the activities associated with particular open reading frames, modules, and submodules of the gene enables production of novel everninomicins that are not otherwise available. Modifications may be made to the DNA sequence that alter either the structure or the sequence of addition of building blocks. The principles have already been described above. In addition, any product resulting from post-transcriptional or post-translational modification in vivo or in vitro based on the DNA sequence information disclosed here is meant to be encompassed by the present invention.

Combinatorial Biosynthesis

The EV biosynthetic enzymes described here are ideal candidates for combinatorial biosynthesis to generate libraries of orthosomycins, particularly everninomicin analogs and homologs, for testing and drug discovery (see Altreuter and Clark, Curr. Op. Biotech., 1999, 10:130; Reynolds, Proc. Natl. Acad. Sci. USA, 1998, 95:112744). Moreover, unlike chemical synthesis, which may depend on the efficiency of a specific reaction to determine product yield, a biosynthetic system can be amplified and propagated to produce high yields of the desired product.

Actinomycetes are well-known microbial biosynthetic factories, and have been modified to produce novel compounds by mutation of specific biosynthetic genes (see Hutchinson, Bio/Technology, 1994, 12:375; Piepersberg, Crit. Rev. Biotech., 1994, 14:251). In addition to mutagenesis in situ, rapid evolution by DNA shuffling, particularly with related genes from other species or from the EV biosynthetic locus itself, provides for more directed evolutionary mutagenesis (Stemmer, Nature, 1994, 370:389). This technique can be practiced, for example, by shuffling EV biosynthetic genes with their closest homologs, as determined by BLAST (or some other homology algorithm) analysis. For example, gene shuffling of two or more transferases can yield new enzymes with altered function. Similarly, sugar biosynthetic genes, orsellinic acid biosynthetic genes, and tailoring genes can be manipulated by the techniques of directed evolution, e.g., gene shuffling, to produce mutants with novel enzymatic and synthetic function. Tailoring enzymes are particularly attractive targets for mutagenesis, since changes to them will not affect synthesis of the core structure but can yield a variety of novel products.

An Integration Vector for Micromonospora

In a specific embodiment, the present invention relates to a new nucleic acid sequence, to vectors for its expression and to its use in fermentation processes in actinomycetes. This nucleic acid sequence encodes the att/int functions of a Micromonospora, and particularly of M. carbonacea var. africana, and thus permits development of an integrating vector. In a specific embodiment, the integrase has an amino acid sequence as depicted in SEQ ID NO: 177. In a more specific embodiment, the integrase is encoded by a nucleic acid having a nucleotide sequence as depicted in SEQ ID NO: 176 (FIG. 7B). A preferred integrating plasmid is shown in FIG. 7A.

Advantageously, the integrative vectors derived from this novel integrase also comprise a recombinant DNA sequence coding for a desired product, including but by no means limited to an EV biosynthetic gene. The product can be a peptide, polypeptide or protein of pharmaceutical or agri-foodstuffs importance. In this case, the system of the invention makes it possible to increase the copy number of this sequence per cell, and hence to increase the levels of production of this product and thus to increase the yields of the preparation process. The desired product can also be a peptide, polypeptide or protein participating in the biosynthesis (synthesis, degradation, transport or regulation) of a metabolite by the actinomycete strain in question. In this case, the system of the invention makes it possible to increase the copy number of this sequence per cell, and hence to increase the levels of production of this product, and thus either to increase the levels of production of the metabolite, or to block the biosynthesis of the metabolite, or to produce derivatives of the metabolite.

Plasmids comprising the site-specific integrating function of the invention can be used to permanently integrate copies of a heterologous gene of choice into the chromosome of many different hosts. The vectors can transform these hosts at a very high efficiency. Because the vectors do not have actinomycete origins of replication, the plasmids cannot exist as autonomously replicating vectors in actinomycete hosts. The plasmids only exist in their integrated form in these hosts. The integrated form is extremely stable, which allows the gene copies to be maintained without antibiotic selective pressure. The result is highly beneficial in terms of cost, efficiency, and stability of the fermentation process.

Those skilled in the art will readily recognize that the variety of vectors which can be created that comprise this fragment is virtually limitless. The only absolute requirement is that the plasmid comprise an origin of replication which functions in the host cell in which constructions are made, such as E. coli or Bacillus. No actinomycete origin of replication is required. In fact, in a specific embodiment the plasmid comprising the integrase comprises no actinomycete origin of replication. Other features, such as an antibiotic resistance gene, a multiple cloning site and a cos site, are useful but not required. A description of the generation and uses of cosmid shuttle vectors can be found in Rao et al. (Methods in Enzymology, 1987, 153:166-198). In short, any plasmid comprising the integrase is within the scope of this invention.

The integrating vectors can be used to integrate genes which increase the yield of known products or generate novel products, such as hybrid antibiotics or other novel secondary metabolites. The vector can also be used to integrate antibiotic resistance genes into strains in order to carry out bioconversions with compounds to which the strain is normally sensitive. The resulting transformed hosts and methods of making the antibiotics are within the scope of the present invention.

The integrase of the invention may thus be used in any actinomycete in the genome of which the vector of the invention or its derivatives are capable of integrating. In particular, they may be used in fermentation processes involving strains of Streptomyces, of mycobacteria, of bacilli, and the like. As an example, there may be mentioned the strains S. pristinaespiralis (ATCC 25486), S. antibioticus (DSM 40868), S. bikiniensis (ATCC 11062), S. parvulus (ATCC 12434), S. glaucescens (ETH 22794), S. actuosus (ATCC 25421), S. coelicolor (A3(2)), S. ambofaciens, S. lividans, S. griseofuscus, S. limosus, and the like (see also, Smokvina et al., Proceedings, 1:403-407).

In this connection, European Patent Publication No. EP 350,341 describes vectors derived from plasmid pSAM2 having very advantageous properties. These vectors are capable of integrating in a site-specific manner in the genome of actinomycetes, and possess a broad host range and high stability. Moreover, they may be used for transferring nucleic acids into actinomycetes and expressing these nucleic acids therein. U.S. Pat. No. 5,741,675 describes tools capable of improving the conditions of industrial use of the vectors derived from pSAM2 by increasing the copy number of pSAM2 or its derivatives, since the free forms are present in a high copy number per cell. This patent also describes cassettes for the expression of this gene, vectors containing it and their use for inducing the appearance of free copies of pSAM2 or integrative vectors derived from the latter.

Alternatively, U.S. Pat. No. 5,190,871 provides methods for increasing a given gene dosage and for adding heterologous genes that lead to the formation of new products such as hybrid antibiotics using plasmids comprising the site-specific integrating function of phage phiC31.

EXAMPLES

The following examples are provided for illustration purposes only and are not intended to limit the scope of the invention, which has been described in broad terms above.

Example 1 Sequencing of Orsellinic Acid Synthetase

The DNA sequence of the Micromonospora carbonacea var. africana (ATCC 39149) everninomicin biosynthetic region was obtained by sequencing inserts of recombinant DNA subclones containing contiguous or overlapping DNA segments of the region indicated in FIG. 2A. All sequences representing the everninomicin region were fully contained in the overlapping cosmid clones pSPRX272, pSPRX262, pSPR192, pSPRX210, and pSPRX256 (FIG. 2A). The sequence was obtained by subcloning and sequencing fragments bounded by restriction sites as indicated in FIG. 2A.

Preliminary sequences were also obtained for the cosmids pSPRX272 and pSPRX256. Restriction maps for these two cosmids are shown in FIGS. 2B and 2C, respectively. These restriction maps are characteristic of these two isolated cosmid clones of the M. carbonacea everninomicin biosynthetic pathway or flanking regions thereof.

In order to obtain the evrJ gene, the sequence can be obtained by subcloning and sequencing the fragments bounded by the KpnI sites at positions 1, 25.9 kb, 29.6 kb, and 34.2 kb. The sequence can also be obtained by subcloning and sequencing the fragments bounded by the BamHI sites at positions 1, 24.5 kb, 27.0 kb, 28.8 kb and 30.5 kb. The resulting fragments should be ligated and cloned in an appropriate recombinant DNA vector. Clones containing the correct orientation of the fragment can be identified by restriction enzyme site mapping.
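As a rough aid to restriction mapping, the approximate sizes of the fragments bounded by the sites listed above can be computed from adjacent site positions. The short Python sketch below is illustrative only: the function name `fragment_sizes` is hypothetical, and "position 1" is treated as 0.0 kb, which is an approximation introduced here. All sizes are approximate, as are the map positions themselves.

```python
def fragment_sizes(site_positions_kb):
    """Return sizes (kb) of the fragments between adjacent restriction sites."""
    sites = sorted(site_positions_kb)
    # Each fragment runs from one site to the next; round to 0.1 kb,
    # the precision of the positions given in the text.
    return [round(b - a, 1) for a, b in zip(sites, sites[1:])]


# Site positions from the text; position 1 is approximated as 0.0 kb.
KPNI_SITES_KB = [0.0, 25.9, 29.6, 34.2]
BAMHI_SITES_KB = [0.0, 24.5, 27.0, 28.8, 30.5]

if __name__ == "__main__":
    print("KpnI fragments (kb):", fragment_sizes(KPNI_SITES_KB))
    print("BamHI fragments (kb):", fragment_sizes(BAMHI_SITES_KB))
```

Under these assumptions the KpnI sites bound fragments of about 25.9, 3.7 and 4.6 kb, and the BamHI sites bound fragments of about 24.5, 2.5, 1.8 and 1.7 kb.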

Example 2 Transformation of M. carbonacea with pSPRH830

M. carbonacea was transformed with pSPRH830b (FIG. 6) by conjugation from E. coli S17-1 (Mazodier et al., Journal of Bacteriology, 1989, 6:3583-3585) to M. carbonacea. E. coli S17-1 containing pSPRH830b was grown overnight at 37° C. in LB supplemented with 100 μg/ml ampicillin (Amp). The culture was inoculated into LB containing 100 μg/ml Amp at a 1:50 ratio and grown with shaking at 37° C. to an OD₆₀₀ of 0.4 to 0.5. Cells were harvested by centrifugation and washed three times with fresh LB lacking Amp. M. carbonacea was grown in TSB medium at 30° C. with shaking to stationary phase. E. coli S17-1 containing pSPRH830b, prepared as described above, was mixed with M. carbonacea in a total volume of 100 μl and plated on AS1 plates using a plastic hockey spreader. Plates were incubated for 15 hours at 29° C. and then overlaid with 50 μg/ml nalidixic acid and 200 μg/ml hygromycin for selection. Transconjugants appearing in 2-3 weeks were picked, homogenized and grown in TSB medium with 50 μg/ml nalidixic acid and 200 μg/ml hygromycin. Presence of pSPRH830b in M. carbonacea transformants was confirmed by PCR analysis and isolation of pSPRH830b from exconjugants.

The ability to transform M. carbonacea with pSPRH830b (on a multicopy plasmid) allows the introduction of second copies of genes contained in the everninomicin biosynthetic pathway or heterologous or mutated genes into M. carbonacea.

Example 3 Transformation of M. carbonacea with pSPRH840

The pSPRH840 integrating vector (FIG. 7A) was constructed as follows. A 4.0 kb KpnI fragment from the pSPR150 cosmid containing the M. carbonacea pMLP1 intM gene was ligated with BamHI cleaved pBluescriptII (Stratagene) to yield pSPRH819. Sequence analysis of the 4.0 kb KpnI fragment from the cosmid revealed the presence of an integrase gene designated intM, an excisionase gene designated xis, and an integrase attachment site designated attP (FIG. 7B).

BLAST analysis of intM showed homology to other integrases in the NRRL database. Analysis of the predicted attP site showed homology to the attP sites found in phage phiC31 and plasmid pSAM2.

A 2.5 kb NruI to XhoI fragment from pSPRH819 was treated with T4 polymerase to generate blunt DNA ends, treated with alkaline phosphatase, and ligated into the pCRTopo 2.1 vector (Invitrogen Corp, Carlsbad, Calif.) to yield pSPRH853. A 2.6 kb KpnI to PstI fragment from pSPRH853 was ligated to KpnI and PstI digested pSPR826b (FIG. 8) to yield pSPRH840 (FIG. 7A). pSPRH840 was transformed into M. carbonacea SCC1413 and M. halophytica SCC760 as described in Example 2. Transconjugants appearing in two to three weeks were picked, homogenized, and grown in TSB medium supplemented with 50 μg/ml nalidixic acid (Nacl) and 200 μg/ml hygromycin. DNA was prepared from transconjugants, cleaved with BamHI, and separated by gel electrophoresis; a Southern blot was prepared and probed with radiolabeled pSPR826b. Southern hybridization analysis confirmed the presence of pSPR826b sequences integrated into the M. carbonacea and M. halophytica chromosomes. Regions including pSPRH840 and chromosomal flanking sequences were cloned by digesting chromosomal DNA with PstI or KpnI, ligating the digested DNA and transforming E. coli XL10 (Stratagene, La Jolla, Calif.). E. coli transformants were isolated, and plasmid DNA was prepared and analyzed by digestion and gel electrophoresis. The attB/attP regions of M. carbonacea and M. halophytica were each sequenced. Sequence analysis of this region confirmed that pSPRH840 had integrated into the M. carbonacea chromosome, specifically into a tRNA region (FIGS. 9A and 9B).

The ability to transform M. carbonacea with pSPRH840 allows the high frequency integration of second copies of genes contained in the everninomicin biosynthetic pathway or heterologous or mutated genes into M. carbonacea.

Example 4 Overexpression and Isolation of Proteins from the EV Region

The coding region of the evrF gene was amplified with PCR primers:

5′ PR 657 (SEQ ID NO: 178): CCC TCG AGA TGT CCA GCA AGA TCC TA;
3′ PR 658 (SEQ ID NO: 179): CGA ATT CTC AGG CAG ACT GCT CTG; and
5′ PR 659 (SEQ ID NO: 180): CCC TCG AGA ATG TCC AGC AAG ATC CTA;
3′ PR 660 (SEQ ID NO: 181): CGA ATT CAG ACT GCT CTG CCG CCG C;

using the Advantage-GC Genomic PCR kit and Advantage HF polymerase (Clontech, Palo Alto, Calif.) and a Perkin-Elmer 9600 PCR machine (Foster City, Calif.). The 1.5 kb PCR products were digested with XhoI and EcoRI, and the fragments were ligated to XhoI and EcoRI digested pBADHisA (primer pair PR657/PR658 product) and pBADMycHisC (primer pair PR659/PR660 product) and transformed into E. coli Top10 (Stratagene, La Jolla, Calif.). Transformants were analyzed by plasmid isolation followed by digestion and gel electrophoresis analysis. Appropriate clones were also verified by sequence analysis. This yielded the evrF expression clones pSPRE59 (pBADHisA) and pSPRE19 (pBADMycHisC). Top10 cells containing either pSPRE59 or pSPRE19 were grown overnight at 37° C. with shaking in LB containing 50 μg/ml Amp. Overnight cultures were used to inoculate fresh LB containing 50 μg/ml Amp and grown at 37° C. with shaking to an OD₆₀₀ of 0.4 to 0.5. L-arabinose was added to a final concentration of 0.02% and the culture was incubated for an additional 4 hours. Cells were collected by centrifugation, resuspended in 100 μl Tris-Glycine buffer and boiled for five minutes. Whole cell protein lysate was loaded onto an SDS-PAGE gel, electrophoresed, and stained with Coomassie blue to determine protein expression.
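Because the PCR products are cloned as XhoI/EcoRI fragments, each 5′ primer is expected to carry an XhoI recognition site (CTCGAG) and each 3′ primer an EcoRI recognition site (GAATTC). The following Python sketch checks this for the four primers listed above. It is an illustrative check only: the helper names `normalize` and `site_offset` are hypothetical, and the primer sequences are simply transcribed from the text.

```python
# Recognition sequences of the two enzymes used for cloning.
XHOI_SITE = "CTCGAG"
ECORI_SITE = "GAATTC"

# Primer sequences as listed in the text (SEQ ID NOs: 178-181), 5' to 3'.
PRIMERS = {
    "PR657": "CCC TCG AGA TGT CCA GCA AGA TCC TA",
    "PR658": "CGA ATT CTC AGG CAG ACT GCT CTG",
    "PR659": "CCC TCG AGA ATG TCC AGC AAG ATC CTA",
    "PR660": "CGA ATT CAG ACT GCT CTG CCG CCG C",
}


def normalize(seq):
    """Remove the triplet spacing and upper-case the sequence."""
    return "".join(seq.split()).upper()


def site_offset(primer_name, site):
    """Return the 0-based offset of `site` in the primer, or -1 if absent."""
    return normalize(PRIMERS[primer_name]).find(site)


if __name__ == "__main__":
    for name, site in [("PR657", XHOI_SITE), ("PR658", ECORI_SITE),
                       ("PR659", XHOI_SITE), ("PR660", ECORI_SITE)]:
        print(name, len(normalize(PRIMERS[name])), "nt; site offset:",
              site_offset(name, site))
```

Each primer contains the expected site close to its 5′ end, consistent with digestion of the PCR products with XhoI and EcoRI before ligation into the pBAD vectors.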

To isolate sufficient amounts of protein for raising antibodies, 100 ml of culture was processed as described above and the His-tagged EvrF protein was purified by Ni-NTA column chromatography using the Xpress Protein Purification System (Invitrogen, Carlsbad, Calif.). The recombinant EvrF protein was purified to over 90% homogeneity. This preparation was fractionated on an SDS-PAGE gel, and the protein band was excised and used to immunize New Zealand white rabbits to raise antibodies. Antisera were generated following a standard protocol, i.e., priming with complete Freund's adjuvant (CFA) and boosting with incomplete Freund's adjuvant (IFA).

Example 5 Everninomicin Pathway Expression of Putative Resistance Genes

Putative everninomicin resistance genes are expressed in the actinomycete vector pSPRH830b. Clones are obtained using standard molecular biology procedures. Plasmids are transformed into Streptomyces lividans or Streptomyces griseofuscus by PEG protoplast transformation or other standard actinomycete transformation procedures. Transformants are tested for increased resistance levels to everninomicin. A schematic of pSPRH830, into which the specific fragments are to be cloned, is shown in FIG. 10.

The EV biosynthetic gene DNAs to be expressed by this recombinant vector are:

    1) 4.9 kb BamHI fragment containing
        evrB, evrC: membrane pumps similar to mithramycin resistance.
    2) 9.7 kb HindIII/BamHI fragment containing
        evbG, evbH: ABC transporter pumps, possible resistance mechanism.
    3) 3.0 kb BamHI fragment containing
        evrE: MDR (multiple drug resistance-type pump) transporter, possible resistance mechanism.
    4) 3.56 kb SacII fragment containing
        evrY: dehalogenase, possible resistance mechanism.
        evrZ: muramidase/lysozyme homology, possible resistance mechanism.
    5) 2.7 kb BamHI fragment containing
        evrMR: 23S rRNA methylase.
    6) A PCR fragment containing
        evcD and evcE: ABC transporters.

Example 6 Insertional Inactivation of EV Pathway Genes

To confirm involvement of evrJ (orsellinic acid synthetase), evrF (halogenase) and evrW (dTDP-glucose dehydratase) in EV biosynthesis, these genes were disrupted in M. carbonacea via homologous recombination using the conjugative suicide vector pSPRH900b. Internal fragments of evrJ, evrF, and evrW were cloned into pSPRH900b to yield pSPRX572, pSPRX570, and pSPRX589, respectively. Plasmids pSPRX572, pSPRX570, and pSPRX589 were inserted into the chromosome by conjugation from E. coli into M. carbonacea to yield strains 572X, 570X and 589X, respectively. Southern analysis confirmed insertion into the correct chromosomal loci for each plasmid. Strains 572X, 570X and 589X showed a loss of EV production, as shown by fermentation and HPLC analysis, indicating that these genes are essential for EV production.

EV was produced and EV production was determined as follows. A mycelial stock of M. carbonacea was inoculated into the seed medium SIM-1 (10 ml) and incubated at 28° C. and 300 rpm. The seed inoculum (5 ml) was then added to 4I+Co production medium (100 ml) and incubated at 28° C. and 300 rpm for 96 hours. A 10 ml aliquot of the fermentation broth was extracted with 20 ml of EtOAc, and the organic phase was evaporated to dryness. After resuspension in 2 ml of MeOH, 10 ml of the extract was subjected to HPLC analysis on a YMC-pack ODS-A C-18 column (3 mm, 150×4.6 mm, Waters Corporation, Milford, Mass.). The column was equilibrated with 3 mM tetramethyl ammonium hydroxide (pH adjusted to 7.2 with glacial acetic acid) with 70% (vol/vol) MeOH and developed with a 24-min linear gradient from 70 to 90% MeOH in the same 3 mM tetramethyl ammonium hydroxide buffer at a flow rate of 0.8 ml/min. EV was detected at 270 nm by UV-Vis detection using an Agilent Series 1100 HPLC system (Agilent Technologies).
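The linear gradient described above can be expressed as a simple function of run time. The Python sketch below is illustrative only: the function names `meoh_percent` and `gradient_volume_ml` are hypothetical, and holding the composition constant before the start and after the end of the gradient is an assumption made here. The 70 to 90% MeOH range, the 24-min duration and the 0.8 ml/min flow rate are taken from the text.

```python
def meoh_percent(t_min, start=70.0, end=90.0, duration_min=24.0):
    """Return % MeOH (vol/vol) at time t_min of a linear gradient.

    The composition is held at `start` before time zero and at `end`
    after the gradient finishes (an assumption for this sketch).
    """
    if t_min <= 0:
        return start
    if t_min >= duration_min:
        return end
    return start + (end - start) * t_min / duration_min


def gradient_volume_ml(flow_ml_per_min=0.8, duration_min=24.0):
    """Return the total mobile-phase volume delivered over the gradient."""
    return flow_ml_per_min * duration_min


if __name__ == "__main__":
    for t in (0, 6, 12, 18, 24):
        print(f"{t:>2} min: {meoh_percent(t):.1f}% MeOH")
    print(f"Volume over gradient: {gradient_volume_ml():.1f} ml")
```

Under these assumptions the composition rises by 20 percentage points over 24 minutes, so the midpoint of the run (12 min) corresponds to 80% MeOH, and about 19.2 ml of mobile phase is delivered during the gradient.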

The present invention is not to be limited in scope by the specific embodiments described herein. Indeed, various modifications of the invention in addition to those described herein will become apparent to those skilled in the art from the foregoing description and the accompanying figures. Such modifications are intended to fall within the scope of the appended claims.

It is further to be understood that all sizes and all molecular weight or molecular mass values are approximate, and are provided for description.

Patents, patent applications, procedures, and publications cited throughout this application are incorporated herein by reference in their entireties.

1. An isolated polynucleotide encoding a polypeptide comprising an amino acid sequence set forth in SEQ ID NO: 177.

2. The polynucleotide of claim 1 which comprises the nucleotide sequence set forth in SEQ ID NO: 176.

3. An isolated vector comprising the polynucleotide of claim 1.

4. The vector of claim 3 which is a plasmid.

5. The vector of claim 3 further comprising a heterologous gene.

6. The vector of claim 3 further comprising a polynucleotide encoding a polypeptide comprising an amino acid sequence selected from the group consisting of: 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, 167, 169, 171, 173, 175, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202 and 204.

7. The vector of claim 3 further comprising a polynucleotide comprising a nucleotide sequence selected from the group consisting of: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201 and 203.

8. A host cell containing the vector of claim 3.

9. The host cell of claim 7, which is a bacterial host cell.

10. The host cell of claim 8, which is an E. coli or an actinomycete.

11. The polynucleotide of claim 1 operably associated with a transcriptional and translational control sequence.

12. The polynucleotide of claim 1 encoding a chimeric polypeptide comprising the amino acid sequence set forth in SEQ ID NO: 177 fused to a heterologous polypeptide.

13. The polynucleotide of claim 12 wherein the heterologous polypeptide is a member selected from the group consisting of a poly-histidine tag, a FLAG tag, a glutathione-S-transferase (GST) tag and a myc epitope tag.

14. The isolated polynucleotide of claim 1 consisting of the nucleotide sequence set forth in SEQ ID NO: 176.